Recent advances in genetic engineering have provided the requisite tools to transform plants to contain
foreign genes. It is now possible to produce plants which have unique characteristics of agronomic and crop
processing importance. One such advantageous trait is enhanced starch and/or solids content and quality in
various crop plants. WO 91/19806 reports the use of a gene which encodes ADPglucose pyrophosphorylase
(ADPGPP), which catalyzes a key step in starch and glycogen biosynthesis. The preferred gene is from E. coli
and the resulting enzyme is a poorly regulated, highly active variant.

Another desirable trait is the reduction of oil in certain food crops, such as peanuts. Decreasing the lipid
content in the seeds of certain plants is desirable due to health concerns or for improved processing qualities.
For example, a low calorie peanut butter, having a higher starch content and lower oil content, would be
beneficial. Also, soybeans having lower oil content would be better for producing certain products, such as tofu,
soy sauce, soy meat extenders, and soy milk. In addition, lower oil content in certain seed-derived products
is desirable, such as corn starch or wheat flour. It has surprisingly been found that such fat reduction is
accomplished by expression of a gene encoding a deregulated ADPGPP, such as glgC16, in the seeds.

SUMMARY OF THE INVENTION

The present invention provides structural DNA constructs which encode an ADPglucose pyrophosphorylase
(ADPGPP) enzyme and which are useful in producing seeds having a reduced oil content.

In accomplishing the foregoing, there is provided, in accordance with one aspect of the present invention,
a method of producing genetically transformed plants which have elevated starch content, comprising the steps
of:

(a) inserting into the genome of a plant cell a recombinant, double-stranded DNA molecule comprising

(i) a promoter which is selected from the group consisting of seed specific promoters,

(ii) a structural DNA sequence that causes the production of an RNA sequence which encodes a fusion
polypeptide comprising an amino-terminal plastid transit peptide and an ADPglucose pyrophosphorylase
enzyme,

(iii) a 3' non-translated DNA sequence which functions in plant cells to cause transcriptional termination
and the addition of polyadenylated nucleotides to the 3' end of the RNA sequence;

(b) obtaining transformed plant cells; and

(c) regenerating from the transformed plant cells genetically transformed plants which have an elevated
starch content.

In accordance with another aspect of the present invention, there is provided a recombinant,
double-stranded DNA molecule comprising in sequence:

(a) a promoter which is selected from the group consisting of seed specific promoters;

(b) a structural DNA sequence that causes the production of an RNA sequence which encodes a fusion
polypeptide comprising an amino-terminal plastid transit peptide and an ADPglucose pyrophosphorylase
enzyme; and

(c) a 3' non-translated region which functions in plant cells to cause transcriptional termination and the
addition of polyadenylated nucleotides to the 3' end of the RNA sequence, said promoter being
heterologous with respect to the structural DNA.

There has also been provided, in accordance with another aspect of the present invention, transformed
plant cells that contain DNA comprised of the above-mentioned elements (a), (b) and (c). In accordance with
yet another aspect of the present invention, differentiated oilseed crop plants are provided that have decreased
oil content in the seeds.

DETAILED DESCRIPTION OF THE INVENTION

The expression of a plant gene which exists in double-stranded DNA form involves transcription of
messenger RNA (mRNA) from one strand of the DNA by RNA polymerase enzyme, and the subsequent processing
of the mRNA primary transcript inside the nucleus. This processing involves a 3' non-translated region which
adds polyadenylate nucleotides to the 3' end of the RNA.

Transcription of DNA into mRNA is regulated by a region of DNA usually referred to as the promoter. The
promoter region contains a sequence of bases that signals RNA polymerase to associate with the DNA, and
to initiate the transcription of mRNA using one of the DNA strands as a template to make a corresponding
complementary strand of RNA.

Promoters which are known or are found to cause transcription of DNA in plant cells can be used in the
present invention. Such promoters may be obtained from a variety of sources such as plants and plant viruses
and include, but are not limited to, the enhanced CaMV35S promoter and promoters isolated from plant genes
such as ssRUBISCO genes. As described below, it is preferred that the particular promoter selected should
be capable of causing sufficient expression to result in the production of an effective amount of ADPGPP
enzyme to cause the desired decrease in oil content. Therefore, it is preferred to bring about expression of the
ADPGPP gene in the seed tissues of the plant and throughout seed development. The promoter chosen
should have the desired tissue and developmental specificity. Those skilled in the art will recognize that the
amount of ADPGPP needed to induce the desired decrease in oil content may vary with the type of plant and
furthermore that too much ADPGPP activity may be deleterious to the plant. Therefore, promoter function
should be optimized by selecting a promoter with the desired tissue expression capabilities and approximate
promoter strength and selecting a transformant which produces the desired ADPGPP activity in the target
tissues. This selection approach from the pool of transformants is routinely employed in expression of
heterologous structural genes in plants since there is variation between transformants containing the same
heterologous gene due to the site of gene insertion within the plant genome (commonly referred to as
"position effect").
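
The selection of a transformant from a pool can be pictured as a simple filter: keep lines whose seed ADPGPP activity is high enough to have an effect but below a level that might be deleterious. The sketch below is illustrative only. The function name, the activity window, and the activity values are hypothetical and are not taken from the specification.

```python
from typing import Dict, List


def select_transformants(activities: Dict[str, float],
                         low: float, high: float) -> List[str]:
    """Return line names whose measured activity lies in [low, high].

    Names are ordered from lowest to highest activity.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    in_window = [(value, name) for name, value in activities.items()
                 if low <= value <= high]
    return [name for value, name in sorted(in_window)]


# Hypothetical seed ADPGPP activities (arbitrary units) for four lines.
measured = {"line1": 0.2, "line2": 1.5, "line3": 0.9, "line4": 6.0}
print(select_transformants(measured, low=0.5, high=2.0))
```

In practice the window would be set per crop, since the amount of ADPGPP needed may vary with the type of plant.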

Promoters may be identified to be seed specific by screening a cDNA library of a plant seed for genes
which are selectively or preferably expressed in seeds and then determining the promoter regions to obtain seed
selective or seed enhanced promoters. It is believed that most of the enzymes involved in carbohydrate
metabolism have seed-specific forms from which seed-specific promoters may be obtained. Examples of such
enzymes are sucrose synthase, invertase, and ADPGPP (both subunits).

Several seed-specific promoters are well known. β-Conglycinin (also known as the 7S protein) is one of
the major storage proteins in soybean (Glycine max) (Tierney, 1987). Promoters from each of the genes for
its three subunits may be used in the present invention. The β-subunit of β-conglycinin has been expressed,
using its endogenous promoter, in the seeds of transgenic petunia and tobacco, showing that the promoter
functions in a seed-specific manner in other plants (Bray, 1987). Example 1 below demonstrates the use of
the α' subunit of this promoter with an ADPGPP in canola. The gene for the 11S storage protein of soybean is
also known to be expressed in a seed-specific manner and its promoter may be used in the present invention.

Two seed-specific promoters from Brassica napus have been identified. They are the promoter for napin
and the promoter for cruciferin (Murphy, 1989). The promoters for the genes encoding phaseolin (from beans)
and oleosin (from rape, soybean, and others) are also useful in the present invention (Zheng, 1993).

The zeins are a group of storage proteins found in maize endosperm. Genomic clones for zein genes have
been isolated (Pedersen, 1982), and the promoters from these clones, including the 15 kD, 16 kD, 19 kD, 22
kD, 27 kD, and gamma genes, could also be used to express an ADPGPP gene in the seeds of maize and
other plants. An endosperm-specific promoter of the 19 kD zein has been identified (Quattrocchio, 1990).
Other promoters known to function in maize include the promoters for the following genes: waxy, Brittle,
Shrunken 2, Branching enzymes I and II, starch synthases, debranching enzymes, oleosins, glutelins, and
sucrose synthases.

Examples of promoters suitable for expression of an ADPGPP gene in wheat include those for the genes
for the ADPGPP subunits, for the granule bound and other starch synthases, for the branching and debranching
enzymes, for the embryogenesis-abundant proteins, for the gliadins, and for the glutenins. Examples of such
promoters in rice include those for the genes for the ADPGPP subunits, for the granule bound and other starch
synthases, for the branching enzymes, for the debranching enzymes, for sucrose synthases, and for the
glutelins (Zheng, 1993). Examples of such promoters for barley include those for the genes for the ADPGPP
subunits, for the granule bound and other starch synthases, for the branching enzymes, for the debranching
enzymes, for sucrose synthases, for the hordeins, for the embryo globulins, and the aleurone specific proteins.

Promoters for genes encoding proteins other than those for storage or carbohydrate metabolism may be found
to be useful in the present invention. For example, the acyl carrier protein gene has a promoter known to
function in a seed-specific manner (Baerson, 1993).

The RNA produced by a DNA construct of the present invention also contains a 5' non-translated leader
sequence. This sequence can be derived from the promoter selected to express the gene, and can be
specifically modified so as to increase translation of the mRNA. The 5' non-translated regions can also be obtained
from viral RNAs, from suitable eukaryotic genes, or from a synthetic gene sequence. The present invention
is not limited to constructs, as presented in the following examples, wherein the non-translated region is
derived from the 5' non-translated sequence that accompanies the promoter sequence. Rather, the
non-translated leader sequence can be derived from an unrelated promoter or coding sequence as discussed above.

The DNA constructs of the present invention also contain a structural coding sequence in double-stranded
DNA form, which encodes a fusion polypeptide comprising an amino-terminal plastid transit peptide and an
ADPGPP enzyme. The ADPGPP enzyme utilized in the present invention is preferably subject to reduced
allosteric control in plants. Such an unregulated ADPGPP enzyme may be selected from known enzymes which
exhibit unregulated enzymatic activity or can be produced by mutagenesis of native bacterial, algal or plant
ADPGPP enzymes as discussed in greater detail hereinafter. In some instances, the substantial differences
in the nature of regulators modulating the activity of the wild type ADPGPP enzyme permit the use of the
wild type gene itself; in these instances, the concentration of the regulators within plant organelles will facilitate
elicitation of significant ADPGPP enzyme activity.

Bacterial ADPglucose Pyrophosphorylases

The E. coli ADPGPP has been well characterized as a tightly regulated enzyme. The activator fructose
1,6-bisphosphate has been shown to activate the enzyme by increasing its Vmax, and by increasing the affinity
of the enzyme for its substrates (Preiss, 1966 and Gentner, 1967). In addition, fructose 1,6-bisphosphate (FBP)
also modulates the sensitivity of the enzyme to the inhibitors adenosine-5'-monophosphate (AMP) and
inorganic phosphate (Pi) (Gentner, 1968).

In 1981, the E. coli K12 ADPGPP gene (glgC), along with the genes for glycogen synthase and branching
enzyme, was cloned, and the resulting plasmid was named pOP12 (Okita, 1981). The glgC gene, which was
sequenced in 1983, contains 1293 bp (SEQ ID NO:1) and encodes 431 amino acids (SEQ ID NO:2) with a
deduced molecular weight of 48,762 (Baecker, 1983).
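
The figures quoted for glgC can be cross-checked with simple codon arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the 1293 bp figure counts sense codons only (no stop codon), which is what the quoted numbers imply.

```python
def codons_in(length_bp: int) -> int:
    """Return the number of complete codons in a coding region."""
    if length_bp % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return length_bp // 3


def mean_residue_mass(molecular_weight: float, residues: int) -> float:
    """Average mass per amino acid residue, in daltons."""
    return molecular_weight / residues


GLGC_BP = 1293        # coding length quoted in the text
GLGC_RESIDUES = 431   # amino acids quoted in the text
GLGC_MW = 48762       # deduced molecular weight quoted in the text

# 1293 / 3 = 431, so the length and residue count agree.
print(codons_in(GLGC_BP) == GLGC_RESIDUES)
print(round(mean_residue_mass(GLGC_MW, GLGC_RESIDUES), 1))
```

The average residue mass that results is in the range typical for proteins, which is a quick plausibility check on the deduced molecular weight.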

The glgC16 gene was generated by chemically mutagenizing E. coli K12 strain PA 601 with
N-methyl-N'-nitrosoguanidine (Cattaneo, 1969 and Creuzet-Sigal, 1972). When the kinetics of the glgC16 ADPGPP were
compared to the parent, it was found that the glgC16 ADPGPP had a higher affinity for ADPglucose in the
absence of the activator, fructose 1,6-bisphosphate (FBP), and the concentration of FBP needed for half-maximal
activation of the enzyme was decreased in glgC16. The inhibition of the ADPGPP activity in glgC16 by 5'-AMP
(AMP) was also reduced.

The DNA sequence of the glgC16 gene is now known (SEQ ID NO:3) (Kumar, 1989). When the glgC16
deduced amino acid sequence (SEQ ID NO:4) was compared to the nonisogenic E. coli K-12 3000, one amino
acid change was noted: Gly 336 to Asp (Meyer et al., 1993).

A number of other ADPGPP mutants have been found in E. coli. The expression of any of these or other
bacterial ADPGPP wild type or mutants could also be used to increase starch production in plants. E. coli K12
strain 6047 (glgC47) accumulates about the same amount of glycogen during stationary phase as does strain
618 (glgC16). Strain 6047, like 618, shows a higher apparent affinity for FBP, and more activity in the absence
of FBP. However, the enzyme from strain 6047 is reportedly more sensitive to inhibition by AMP compared to
the enzyme from strain 618 (Latil-Damotte, 1977).

The glgC gene from Salmonella typhimurium LT2 has also been cloned and sequenced (Leung and Preiss
1987a). The gene encodes 431 amino acids with a deduced molecular weight of 45,580. The Salmonella
typhimurium LT2 glgC gene and the same gene from E. coli K-12 have 90% identity at the amino acid level and
80% identity at the DNA level. Like the E. coli ADPGPP, the Salmonella typhimurium LT2 ADPGPP is also
activated by FBP and is inhibited by AMP (Leung and Preiss 1987b). This substantial conservation in amino acid
sequences suggests that introduction of mutations which cause enhancement of ADPGPP activity in E. coli
into the S. typhimurium ADPGPP gene should have a similar effect on the ADPGPP enzyme of this organism.
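
Identity figures such as the 90% and 80% quoted above are computed from an alignment. A minimal sketch follows, assuming the two sequences have already been aligned to equal length and that columns containing a gap are excluded from the count (conventions differ between tools). The function name and the toy fragments are hypothetical and are not real glgC sequence.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned, gap-free columns in which both sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical aligned 10-residue fragments differing at one position.
toy_a = "MVSLEKNDHL"
toy_b = "MVSLEKNDRL"
print(percent_identity(toy_a, toy_b))
```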

A number of other bacterial ADPGPPs have been characterized by their response to activators and
inhibitors (for review see: Preiss 1973). Like the Escherichia coli ADPGPP, the ADPGPPs from Aerobacter
aerogenes, Aerobacter cloacae, Citrobacter freundii, and Escherichia aurescens are all activated by FBP and are
inhibited by AMP. The ADPGPP from Aeromonas formicans is activated by fructose 6-phosphate or FBP, and
is inhibited by ADP. The Serratia marcescens ADPGPP, however, was not activated by any metabolite tested.
The photosynthetic Rhodospirillum rubrum has an ADPGPP that is activated by pyruvate, and none of the
tested compounds, including Pi, AMP or ADP, inhibit the enzyme. Several algal ADPGPPs have been studied and
found to have regulation similar to that found for plant ADPGPPs. Obviously, the ADPGPPs from many
organisms could be used to increase starch biosynthesis and accumulation in plants.

Plant ADPglucose Pyrophosphorylases 

At one time, UDPglucose was thought to be the primary substrate for starch biosynthesis in plants.
However, ADPglucose was found to be a better substrate for starch biosynthesis than UDPglucose (Recondo,
1961). This same report states that ADPGPP activity was found in plant material.

A spinach leaf ADPGPP was partially purified and was shown to be activated by 3-phosphoglycerate
(3-PGA) and inhibited by inorganic phosphate (Ghosh et al., 1966). The report by Ghosh et al. suggested that the
biosynthesis of leaf starch was regulated by the level of ADPglucose. The activator, 3-PGA, is the primary
product of CO2 fixation in photosynthesis. During photosynthesis, the levels of 3-PGA would increase, causing
activation of ADPGPP. At the same time, the levels of Pi would decrease because of photophosphorylation,
decreasing the inhibition of ADPGPP. These changes would cause an increase in ADPglucose production and
starch biosynthesis. During darkness, 3-PGA levels would decrease, and Pi levels would increase, decreasing
the activity of ADPGPP and, therefore, decreasing biosynthesis of ADPglucose and starch.

The ADPGPP from spinach leaves was later purified to homogeneity and shown to contain subunits of 51
and 54 kDa (Morell, 1987). Based on antibodies raised against the two subunits, the 51 kDa protein has
homology with both the maize endosperm and potato tuber ADPGPPs, but not with the spinach leaf 54 kDa protein.

The sequence of a rice endosperm ADPGPP subunit cDNA clone has been reported (Anderson, 1989a).
The clone encoded a protein of 483 amino acids. A comparison of the rice endosperm ADPGPP and the E. coli
ADPGPP protein sequences shows about 30% identity. Also in 1989, an almost full-length cDNA clone for the
wheat endosperm ADPGPP was sequenced (Olive, 1989). The wheat endosperm ADPGPP clone has about
24% identity with the E. coli ADPGPP protein sequence, while the wheat and the rice clones have 40% identity
at the protein level.

The maize endosperm ADPGPP has been purified and shown to have catalytic and regulatory properties
similar to those of other plant ADPGPPs (Plaxton, 1987). The native molecular weight of the maize endosperm
enzyme is 230,000, and it is composed of four subunits of similar size.

The native molecular weight of the potato tuber ADPGPP is reported to be 200,000, with a subunit size of
50,000 (Sowokinos, 1982). Activity of the tuber ADPGPP is almost completely dependent on 3-PGA, and as
with other plant ADPGPPs, is inhibited by Pi. The potato tuber and leaf ADPGPPs have been demonstrated
to be similar in physical, catalytic, and allosteric properties (Anderson, 1989b).
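
The molecular weights quoted above are consistent with a tetrameric enzyme, as the following arithmetic sketch shows. The helper names are hypothetical; the numbers are those given in the text.

```python
def subunit_count(native_mw: float, subunit_mw: float) -> int:
    """Estimate the number of subunits from native and subunit weights."""
    return round(native_mw / subunit_mw)


def mean_subunit_mw(native_mw: float, n_subunits: int) -> float:
    """Average subunit weight for a complex of n subunits of similar size."""
    return native_mw / n_subunits


# Potato tuber enzyme: 200,000 native, 50,000 per subunit.
print(subunit_count(200_000, 50_000))
# Maize endosperm enzyme: 230,000 native, four subunits of similar size.
print(mean_subunit_mw(230_000, 4))
```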

Production of Altered ADPglucose Pyrophosphorylase Genes by Mutagenesis

Those skilled in the art will recognize that while not absolutely required, enhanced results are to be
obtained by using ADPGPP genes which are subject to reduced allosteric regulation ("deregulated") and more
preferably not subject to significant levels of allosteric regulation ("unregulated") while maintaining adequate
catalytic activity. In cells which do not normally accumulate significant quantities of starch, expression of a
"regulated" enzyme may be sufficient. In starch-accumulating cells and tissues, a "deregulated" or
"unregulated" enzyme is the preferred system. The structural coding sequence for a bacterial or plant ADPGPP enzyme
can be mutagenized in E. coli or another suitable host and screened for increased glycogen production as
described for the glgC16 gene of E. coli. It should be realized that use of a gene encoding an ADPGPP enzyme
which is only subject to modulators (activators/inhibitors) which are present in the selected plant at levels which
do not significantly inhibit the catalytic activity will not require enzyme (gene) modification. These
"unregulated" or "deregulated" ADPGPP genes can then be inserted into plants as described herein to obtain transgenic
plants having increased starch content.

For example, any ADPGPP gene can be cloned into the E. coli B strain AC70R1-504 (Leung, 1986). This
strain has a defective ADPGPP gene, and is derepressed five- to seven-fold for the other glycogen biosynthetic
enzymes. The ADPGPP gene/cDNA's can be put on a plasmid behind the E. coli glgC promoter or any other
bacterial promoter. This construct can then be subjected to either site-directed or random mutagenesis. After
mutagenesis, the cells would be plated on rich medium with 1% glucose. After the colonies have developed,
the plates would be flooded with iodine solution (0.2 w/v% I2, 0.4 w/v% KI in H2O, Creuzet-Sigal, 1972). By
comparison with an identical plate containing non-mutated E. coli, colonies that are producing more glycogen
can be detected by their darker staining.
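
The iodine solution is specified in w/v%, that is, grams of solute per 100 ml of solution. A small sketch for scaling the recipe to a working volume follows; the function name and the 500 ml example volume are assumptions made for illustration.

```python
from typing import Dict


def iodine_stain_recipe(volume_ml: float,
                        i2_wv_pct: float = 0.2,
                        ki_wv_pct: float = 0.4) -> Dict[str, float]:
    """Grams of I2 and KI needed for volume_ml of staining solution.

    w/v% is interpreted as grams per 100 ml of final solution.
    """
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    return {
        "I2_g": i2_wv_pct * volume_ml / 100.0,
        "KI_g": ki_wv_pct * volume_ml / 100.0,
    }


# Example: a hypothetical 500 ml batch.
print(iodine_stain_recipe(500))
```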

Since the mutagenesis procedure could have created promoter mutations, any putative ADPGPP mutant
from the first round screening will have to have the ADPGPP gene recloned into non-mutated vector and the
resulting plasmid will be screened in the same manner. The mutants that make it through both rounds of
screening will then have their ADPGPP activities assayed with and without the activators and inhibitors. By comparing
the mutated ADPGPP's responses to activators and inhibitors to those of the non-mutated enzymes, the new mutant
can be characterized.

The report by Plaxton and Preiss in 1987 demonstrates that the maize endosperm ADPGPP has regulatory
properties similar to those of the other plant ADPGPPs. They show that the enhanced activity in the absence of
activator (3-PGA) and the decreased sensitivity to the inhibitor (Pi) claimed in earlier reports for the maize
endosperm ADPGPP were due to proteolytic cleavage of the enzyme during the isolation procedure. By altering an
ADPGPP gene to produce an enzyme analogous to the proteolytically cleaved maize endosperm ADPGPP,
decreased allosteric regulation will be achieved. The recent report concerning the apparent novelty of the
regulation of the barley endosperm ADPGPP and its apparent insensitivity to 3-PGA is not generally accepted
since the report shows that the enzyme preparation was rapidly degraded and may suffer from the same
problems identified for the corn endosperm preparation.

To assay a liquid culture of E. coli for ADPGPP activity, the cells are spun down in a centrifuge and
resuspended in about 2 ml of extraction buffer (0.05 M glycylglycine pH 7.0, 5.0 mM DTE, 1.0 mM EDTA) per
gram of cell paste. The cells are lysed by passing twice through a French Press. The cell extracts are spun in
a microcentrifuge for 5 minutes, and the supernatants are desalted by passing through a G-50 spin column.

The enzyme assay for the synthesis of ADPglucose is a modification of a published procedure (Haugen,
1976). Each 100 µl assay contains: 10 µmole Hepes pH 7.7, 50 µg BSA, 0.05 µmole of [14C]glucose-1-phosphate,
0.15 µmole ATP, 0.5 µmole MgCl2, 0.1 µg of crystalline yeast inorganic pyrophosphatase, 1 mM
ammonium molybdate, enzyme, activators or inhibitors as desired, and water. The assay is incubated at 37°C for 10
minutes, and is stopped by boiling for 60 seconds. The assay is spun down in a microcentrifuge, and 40 µl of
the supernatant is injected onto a Synchrom Synchropak AX-100 anion exchange HPLC column. The sample
is eluted with 65 mM KPi pH 5.5. Unreacted [14C]glucose-1-phosphate elutes around 7-8 minutes, and
[14C]ADPglucose elutes at approximately 13 minutes. Enzyme activity is determined by the amount of
radioactivity found in the ADPglucose peak.
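
The assay quantities can be converted to final concentrations, and peak radioactivity to product formed, with a few lines of arithmetic. The sketch below is a hedged illustration: the function names are hypothetical, and the specific activity of the labelled glucose-1-phosphate (counts per nmol) is not given in the text, so the value used in the example is an assumption.

```python
ASSAY_VOLUME_UL = 100.0   # total assay volume from the text
INJECTED_UL = 40.0        # volume injected onto the HPLC column
INCUBATION_MIN = 10.0     # incubation time from the text


def millimolar(amount_umol: float, volume_ul: float = ASSAY_VOLUME_UL) -> float:
    """Convert an amount in µmol in a volume in µl to mM (1 µmol/µl = 1000 mM)."""
    return amount_umol / volume_ul * 1000.0


def adpglucose_formed_nmol(peak_cpm: float, cpm_per_nmol: float) -> float:
    """nmol ADPglucose in the whole assay, scaled up from the injected fraction."""
    nmol_injected = peak_cpm / cpm_per_nmol
    return nmol_injected * ASSAY_VOLUME_UL / INJECTED_UL


def activity_nmol_per_min(peak_cpm: float, cpm_per_nmol: float) -> float:
    """Average rate of ADPglucose synthesis over the incubation."""
    return adpglucose_formed_nmol(peak_cpm, cpm_per_nmol) / INCUBATION_MIN


# Final concentrations implied by the amounts in the text.
for name, umol in [("Hepes", 10), ("glucose-1-phosphate", 0.05),
                   ("ATP", 0.15), ("MgCl2", 0.5)]:
    print(name, millimolar(umol), "mM")

# Hypothetical peak: 4000 cpm at an assumed 1000 cpm per nmol.
print(activity_nmol_per_min(4000, 1000), "nmol/min")
```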

The plant ADPGPP enzyme activity is tightly regulated, by both positive (3-phosphoglycerate; 3-PGA) and
negative effectors (inorganic phosphate; Pi) (Ghosh and Preiss, 1966; Copeland and Preiss 1981; Sowokinos
and Preiss 1982; Morell et al., 1987; Plaxton and Preiss, 1987; Preiss, 1988) and the ratio of 3-PGA:Pi plays
a prominent role in regulating starch biosynthesis by modulating the ADPGPP activity (Kaiser and Bassham,
1979). The plant ADPGPP enzymes are heterotetramers of two large/"Shrunken" and two small/"Brittle"
subunits (Morell et al., 1987; Lin et al., 1988a, 1988b; Krishnan et al., 1986; Okita et al., 1990) and there is strong
evidence to suggest that the heterotetramer is the most active form of ADPGPP. Support for this suggestion
comes from the isolation of plant "starchless" mutants that are deficient in either of the subunits (Dickinson
and Preiss, 1969; Lin et al., 1988a, 1988b) and from the characterization of an "ADPGPP" homotetramer of
small subunits that was found to have only low enzyme activity (Lin et al., 1988b). In addition, proposed effector
interaction residues have been identified for both subunits (Morell et al., 1988). Direct evidence for the active
form of the enzyme and further support of the kinetic data reported for the purified potato enzyme comes from
the expression of potato ADPGPP activity in E. coli and the comparison of the kinetic properties of this material
and that from potato tubers (Iglesias et al., 1993).

Unregulated enzyme variants of the plant ADPGPP are identified and characterized in a manner similar
to that which resulted in the isolation of the E. coli glgC16 and related mutants. A number of plant ADPGPP
cDNA's, or portions of such cDNA's, for both the large and small subunits, have been cloned from both
monocots and dicots (Anderson et al., 1989a; Olive et al., 1989; Muller et al., 1990). The proteins encoded by the
plant cDNA's, as well as those described from bacteria, show a high degree of conservation (Bhave et al.,
1990). In particular, a highly conserved region, also containing some of the residues implicated in enzyme
function and effector interactions, has been identified (Morell et al., 1988; Smith-White and Preiss, 1992). Clones
of the potato tuber ADPGPP subunit genes have been isolated. These include a complete small subunit gene,
assembled by addition of sequences from the first exon of the genomic clone with a nearly full-length cDNA
clone of the same gene, and an almost complete gene for the large subunit. The nucleotide sequence (SEQ
ID NO:7) and the amino acid sequence (SEQ ID NO:8) of the assembled small subunit gene are given below.
The nucleotide sequence presented here differs from the gene originally isolated in the following ways: a
BglII + NcoI site was introduced at the ATG codon to facilitate the cloning of the gene into E. coli and plant
expression vectors by site directed mutagenesis utilizing the oligonucleotide primer sequence

GTTGATAACAAGATCTGTTAACCATGGCGGCTTCC (SEQ ID NO:11).

A SacI site was introduced at the stop codon utilizing the oligonucleotide primer sequence

CCAGTTAAAACGGAGCTCATCAGATGATGATTC (SEQ ID NO:12).

The SacI site serves as a 3' cloning site. An internal BglII site was removed utilizing the oligonucleotide primer
sequence

GTGTGAGAACATAAATCTTGGATATGTTAC (SEQ ID NO:13).

This assembled gene was expressed in E. coli under the control of the recA promoter in a PrecA-gene10L
expression cassette (Wong et al., 1988) to produce measurable levels of the protein. An initiating methionine
codon is placed by site-directed mutagenesis utilizing the oligonucleotide primer sequence

GAATTCACAGGGCCATGGCTCTAGACCC (SEQ ID NO:14)

to express the mature gene.
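
The restriction sites that each mutagenic primer is said to introduce or remove can be confirmed by scanning the primer for the enzyme recognition sequences. The sketch below is illustrative; the helper name is hypothetical, the recognition sequences are the standard ones for BglII, NcoI and SacI, and because these sites are palindromic only one strand needs to be searched.

```python
from typing import List

# Standard recognition sequences (all palindromic).
SITES = {"BglII": "AGATCT", "NcoI": "CCATGG", "SacI": "GAGCTC"}

# Primer sequences as given in the text.
PRIMERS = {
    "SEQ ID NO:11": "GTTGATAACAAGATCTGTTAACCATGGCGGCTTCC",
    "SEQ ID NO:12": "CCAGTTAAAACGGAGCTCATCAGATGATGATTC",
    "SEQ ID NO:13": "GTGTGAGAACATAAATCTTGGATATGTTAC",
    "SEQ ID NO:14": "GAATTCACAGGGCCATGGCTCTAGACCC",
}


def sites_in(sequence: str) -> List[str]:
    """Return the names of the listed enzymes whose site occurs in sequence."""
    seq = sequence.upper()
    return sorted(name for name, site in SITES.items() if site in seq)


for label, primer in PRIMERS.items():
    print(label, sites_in(primer))
```

SEQ ID NO:13 carries none of the three sites, which is consistent with its stated purpose of removing an internal BglII site.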

The nucleotide sequence (SEQ ID NO:9) and the amino acid sequence (SEQ ID NO:10) of the almost
complete large subunit gene are given below. An initiating methionine codon has been placed at the mature
N-terminus by site-directed mutagenesis utilizing the oligonucleotide primer sequence

AAGATCAAACCTGCCATGGCTTACTCTGTGATCACTACTG (SEQ ID NO:15).

The purpose of the initiating methionine is to facilitate the expression of this large subunit gene in E. coli. A
HindIII site is located 103 bp after the stop codon and serves as the 3' cloning site. The complete large ADPGPP
gene is isolated by the 5' RACE procedure (Rapid Amplification of cDNA Ends; Frohman, 1990; Loh, 1989).
The oligonucleotide primers for this procedure are as follows:

1) GGGAATTCAAGCTTGGATCCCGGGCCCCCCCCCCCCCCC (SEQ ID NO:16);

2) GGGAATTCAAGCTTGGATCCCGGG (SEQ ID NO:17); and

3) CCTCTAGACAGTCGATCAGGAGCAGATGTACG (SEQ ID NO:18).

The first two are the equivalent to the ANpolyC and the AN primers of Loh et al. (1989), respectively, and the
third is the reverse complement to a sequence in the large ADPGPP gene, located after the PstI site in SEQ
ID NO:9. The PCR 5' sequence products are cloned as EcoRI/HindIII/BamHI-PstI fragments and are easily
assembled with the existing gene portion.
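
The relationships among the three RACE primers can be checked directly: the first should equal the second plus a poly-C tail, the shared adapter should carry the EcoRI, HindIII and BamHI sites used for cloning, and the sense-strand sequence targeted by the third primer is its reverse complement. The sketch below uses hypothetical variable and function names, and assumes the primer strings are exactly as printed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


AN_POLYC = "GGGAATTCAAGCTTGGATCCCGGGCCCCCCCCCCCCCCC"   # SEQ ID NO:16
AN = "GGGAATTCAAGCTTGGATCCCGGG"                       # SEQ ID NO:17
GENE_SPECIFIC = "CCTCTAGACAGTCGATCAGGAGCAGATGTACG"    # SEQ ID NO:18

# Standard recognition sequences for the adapter cloning sites.
ADAPTER_SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}

tail = AN_POLYC[len(AN):]
print("adapter shared:", AN_POLYC.startswith(AN))
print("tail is poly-C:", set(tail) == {"C"})
print({name: AN.find(site) for name, site in ADAPTER_SITES.items()})

# Sense-strand sequence complementary to the gene-specific primer. Any
# non-annealing 5' extension on the primer would not match the gene.
print(reverse_complement(GENE_SPECIFIC))
```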

The weakly regulated enzyme mutants of ADPGPP are identified by initially scoring colonies from a mu- 
tagen ized E. co// culture that show elevated glycogen synthesis, by iodine staining of 24-48 hour colonies on 
Luria-Agar plates containing glucose at 1%, and then by characterizing the responses of the ADPGPP enzymes 

30 from these isolates to the positive and negative effectors of this activity (Cattaneo et al., 1969; Preiss et aL, 
1971). A similar approach is applied to the isolation of such variants of the plant ADPGPP enzymes. Given an 
expression system for each of the subunit genes, mutagenesis of each gene is carried out separately, by any 
of a variety of known means, both chemical or physical (Miller, 1972) on cultures containing the gene or on 
purified DNA. Another approach is to use a PCR procedure (Ehrlich, 1989) on the complete gene in the pres- 

35 ence of inhibiting Mn ++ ions, a condition that leads to a high rate of misincorporation of nucleotides. A PCR 
procedure may also be used with primers adjacent to just a specific region of the gene, and this mutagenized 
fragment then recloned into the non-mutagenized gene segments. A random synthetic oligo-nucleotide pro- 
cedure may also be used to generate a highly mutagenized short region of the gene by mixing of nucleotides 
in the synthesis reaction to result in misincorporation at all positions in this region. This small region is flanked 

40 by restriction sites that are used to reinsert this region into the remainder of the gene. The resultant cultures 
or transformants are screened by the standard iodine method for those exhibiting glycogen levels higher than 
controls. Preferably this screening is carried out in an E. coli strain deficient only in ADPGPP activity and is 
phenotypically glycogen-minus and that is complemented to glycogen-plus by glgC. The E. coli strain should
retain those other activities required for glycogen production. Both genes are expressed together in the same
E. coli host by placing the genes on compatible plasmids with different selectable marker genes, and these
plasmids also have similar copy numbers in the bacterial host to maximize heterotetramer formation. An
example of such an expression system is the combination of pMON17335 and pMON17336 (Iglesias et al., 1993).
The use of separate plasmids enables the screening of a mutagenized population of one gene alone, or in
conjunction with the second gene following transformation into a competent host expressing the other gene, and
the screening of two mutagenized populations after they are combined in the same host. Following
re-isolation of the plasmid DNA from colonies with increased iodine staining, the ADPGPP coding sequences
are recloned into expression vectors, the phenotype is verified, and the ADPGPP activity and its response to
the effector molecules are determined. Improved variants will display increased Vmax, reduced inhibition by the
negative effector (Pi), or reduced dependence upon activator (3-PGA) for maximal activity. The assay for such
improved characteristics involves the determination of ADPGPP activity in the presence of Pi at 0.045 mM
(I0.5 = 0.045 mM) or in the presence of 3-PGA at 0.075 mM (A0.5 = 0.075 mM). The useful variants will display <40%
inhibition at this concentration of Pi or display >50% activity at this concentration of 3-PGA. Following the
isolation of improved variants and the determination of the subunit or subunits responsible, the mutation(s) are
determined by nucleotide sequencing. The mutation is confirmed by recreating this change by site-directed
mutagenesis and reassaying ADPGPP activity in the presence of activator and inhibitor. This mutation is then
transferred to the equivalent complete ADPGPP cDNA gene, by recloning the region containing the change
from the altered bacterial expression form to the plant form containing the amyloplast targeting sequence, or
by site-directed mutagenesis of the complete native ADPGPP plant gene.
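
The selection thresholds stated above can be illustrated with a minimal Python sketch. The class name, field names and activity values are hypothetical and are not taken from the specification. The sketch also assumes that percent inhibition is measured relative to the activity without Pi, and that percent activity is measured relative to the fully activated enzyme; the text does not spell out either reference point.

```python
from dataclasses import dataclass

# Effector concentrations named in the text (mM).
PI_TEST_MM = 0.045    # Pi concentration, equal to the stated I0.5
PGA_TEST_MM = 0.075   # 3-PGA concentration, equal to the stated A0.5


@dataclass
class VariantAssay:
    """Hypothetical record of ADPGPP activities for one variant (arbitrary units)."""
    name: str
    activity_without_pi: float   # activity with no Pi added
    activity_with_pi: float      # activity at PI_TEST_MM Pi
    activity_with_pga: float     # activity at PGA_TEST_MM 3-PGA
    activity_max: float          # activity at saturating 3-PGA (assumed reference)


def percent_inhibition_by_pi(assay: VariantAssay) -> float:
    # Assumption: inhibition is expressed relative to the activity without Pi.
    return 100.0 * (1.0 - assay.activity_with_pi / assay.activity_without_pi)


def percent_of_max_with_pga(assay: VariantAssay) -> float:
    # Assumption: activity is expressed relative to the fully activated enzyme.
    return 100.0 * assay.activity_with_pga / assay.activity_max


def is_useful_variant(assay: VariantAssay) -> bool:
    # Criteria from the text: <40% inhibition by Pi, or >50% activity with 3-PGA.
    return (percent_inhibition_by_pi(assay) < 40.0
            or percent_of_max_with_pga(assay) > 50.0)


# Illustrative values only: a reference-like enzyme sits at 50% for both measures.
reference_like = VariantAssay("reference-like", 100.0, 50.0, 50.0, 100.0)
less_inhibited = VariantAssay("less-inhibited", 100.0, 80.0, 50.0, 100.0)

if __name__ == "__main__":
    for assay in (reference_like, less_inhibited):
        print(assay.name, is_useful_variant(assay))
```

Under these assumptions an enzyme that behaves like the reference at both test concentrations fails both criteria, while a variant that is less sensitive to Pi passes on the first criterion alone.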

Chloroplast/Amyloplast Directed Expression of ADPGPP Activity 

Starch biosynthesis is known to take place in plant chloroplasts and amyloplasts (herein collectively
referred to as plastids). In the plants that have been studied, the ADPGPP is localized to these plastids. Many
chloroplast-localized proteins are expressed from nuclear genes as precursors and are targeted to the
chloroplast by a chloroplast transit peptide (CTP) that is removed during the import steps. Examples of such
chloroplast proteins include the small subunit of ribulose-1,5-bisphosphate carboxylase (ssRUBISCO, SSU),
5-enolpyruvylshikimate-3-phosphate synthase (EPSPS), ferredoxin, ferredoxin oxidoreductase, the
light-harvesting-complex protein I and protein II, and thioredoxin F. It has been demonstrated in vivo and in vitro
that non-chloroplast proteins may be targeted to the chloroplast by use of protein fusions with a CTP and that
a CTP sequence is sufficient to target a protein to the chloroplast. Likewise, amyloplast-localized proteins are
expressed from nuclear genes as precursors and are targeted to the amyloplast by an amyloplast transit
peptide (ATP).

In the exemplary embodiments, a specialized CTP derived from the ssRUBISCO 1A gene from Arabidopsis
thaliana (SSU 1A) (Timko, 1988) was used. This CTP (CTP1) was constructed by a combination of
site-directed mutageneses. The CTP1 nucleotide sequence (SEQ ID NO:5) and the corresponding amino acid
sequence (SEQ ID NO:6) are given below. CTP1 is made up of the SSU 1A CTP (amino acids 1-55), the first 23
amino acids of the mature SSU 1A protein (56-78), a serine residue (amino acid 79), a new segment that
repeats amino acids 50 to 56 from the CTP and the first two from the mature protein (amino acids 80-87), and
an alanine and a methionine residue (amino acids 88 and 89). An NcoI restriction site is located at the 3' end
(spans the Met codon) to facilitate the construction of precise fusions to the 5' end of an ADPGPP gene. At a later
stage, a BglII site was introduced upstream of the N-terminus of the SSU 1A sequences to facilitate the
introduction of the fusions into plant transformation vectors. A fusion was assembled between the structural DNA
encoding the CTP1 CTP and the glgC16 gene from E. coli to produce a complete structural DNA sequence
encoding the plastid transit peptide/ADPGPP fusion polypeptide.
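
The layout of CTP1 described above can be checked against the amino acid sequence of SEQ ID NO:6 with a short Python sketch. The helper names are hypothetical. Comparing residues 80-87 with residues 50-57 is one reading of the listing (an eight-residue repeat) and is an assumption of the sketch, not a statement made in the text.

```python
# CTP1 amino acid sequence as given in SEQ ID NO:6 (three-letter codes).
CTP1 = (
    "Met Ala Ser Ser Met Leu Ser Ser Ala Thr Met Val Ala Ser Pro Ala "
    "Gln Ala Thr Met Val Ala Pro Phe Asn Gly Leu Lys Ser Ser Ala Ala "
    "Phe Pro Ala Thr Arg Lys Ala Asn Asn Asp Ile Thr Ser Ile Thr Ser "
    "Asn Gly Gly Arg Val Asn Cys Met Gln Val Trp Pro Pro Ile Gly Lys "
    "Lys Lys Phe Glu Thr Leu Ser Tyr Leu Pro Asp Leu Thr Asp Ser Gly "
    "Gly Arg Val Asn Cys Met Gln Ala Met"
).split()


def residues(first, last):
    """Return residues first..last using 1-based, inclusive numbering."""
    return CTP1[first - 1:last]


def ctp1_layout_checks():
    """Compare the described layout of CTP1 with the listed sequence."""
    return {
        "length_is_89": len(CTP1) == 89,
        "residue_79_is_Ser": residues(79, 79) == ["Ser"],
        # Assumption: the repeated segment is aligned with residues 50-57.
        "80_87_repeats_50_57": residues(80, 87) == residues(50, 57),
        "ends_with_Ala_Met": residues(88, 89) == ["Ala", "Met"],
    }


if __name__ == "__main__":
    for name, ok in ctp1_layout_checks().items():
        print(f"{name}: {ok}")
```

The terminal Ala-Met pair corresponds to the GCC ATG G end of SEQ ID NO:5, where the NcoI site spans the Met codon.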

Those skilled in the art will recognize that if either a single plant ADPGPP cDNA encoding shrunken and/or
brittle subunits or both plant ADPGPP cDNAs encoding shrunken and brittle subunits is utilized in the practice
of the present invention, the endogenous CTP or ATP could most easily and preferably be used. Hence, for
purposes of the present invention the term "plastid transit peptides" should be interpreted to include both
chloroplast transit peptides and amyloplast transit peptides. Those skilled in the art will also recognize that various
other chimeric constructs can be made which utilize the functionality of a particular plastid transit peptide to
import the contiguous ADPGPP enzyme into the plant cell chloroplast/amyloplast depending on the promoter
tissue specificity.

Polyadenylation Signal

The 3' non-translated region of the chimeric plant gene contains a polyadenylation signal which functions
in plants to cause the addition of polyadenylate nucleotides to the 3' end of the RNA. Examples of suitable
3' regions are (1) the 3' transcribed, non-translated regions containing the polyadenylation signal of
Agrobacterium tumor-inducing (Ti) plasmid genes, such as the nopaline synthase (NOS) gene, and (2) plant genes
like the soybean storage protein genes and the small subunit of the ribulose-1,5-bisphosphate carboxylase
(ssRUBISCO) gene. An example of a preferred 3' region is that from the NOS gene, described in greater detail
in the examples below.

Plant Transformation/Regeneration

Plants which can be made to have decreased oil content by practice of the present invention include, but
are not limited to, corn, wheat, rice, pea, peanut, canola/oilseed rape, cotton, barley, sorghum, soybean,
sunflower, almond, cashew, pecan, and walnut.

A double-stranded DNA molecule of the present invention containing the functional plant ADPGPP gene
can be inserted into the genome of a plant by any suitable method. Suitable plant transformation vectors include
those derived from a Ti plasmid of Agrobacterium tumefaciens, as well as those disclosed, e.g., by
Herrera-Estrella (1983), Bevan (1983), Klee (1985) and EPO publication 120,516 (Schilperoort et al.). In addition to
plant transformation vectors derived from the Ti or root-inducing (Ri) plasmids of Agrobacterium, alternative
methods can be used to insert the DNA constructs of this invention into plant cells. Such methods may involve,
for example, the use of liposomes, electroporation, chemicals that increase free DNA uptake, free DNA delivery
via microprojectile bombardment, and transformation using viruses or pollen.

Examples of vectors designed for the expression of glgC16 and other ADPGPP genes in monocots and
dicots are reported by Kishore in WO 91/19806. These are used to transform the desired plant cells by the
appropriate method.

When adequate numbers of cells (or protoplasts) containing the ADPGPP gene or cDNA are obtained, the
cells (or protoplasts) are regenerated into whole plants. Choice of methodology for the regeneration step is
not critical, with suitable protocols being available for hosts from Leguminosae (alfalfa, soybean, clover, etc.),
Cruciferae (cabbage, radish, canola/rapeseed, etc.), Gramineae (wheat, barley, rice, corn, etc.), various floral
crops, such as sunflower, and nut-bearing trees, such as almonds, cashews, walnuts, and pecans. See, e.g.,
Ammirato, 1984; Shimamoto, 1989; Fromm, 1990; Vasil, 1990; Hayashimoto, 1989; and Datta, 1990.

The following examples are provided to better elucidate the practice of the present invention and should
not be interpreted in any way to limit the scope of the present invention. Those skilled in the art will recognize
that various modifications, truncations, etc. can be made to the methods and genes described herein while
not departing from the spirit and scope of the present invention.

Example 1

To express the E. coli glgC16 gene in plant cells, and to target the enzyme to the plastids, the gene needed
to be fused to a DNA encoding the plastid-targeting transit peptide (hereinafter referred to as the CTP/ADPGPP
gene), and to the proper plant regulatory regions. Detailed examples of how to accomplish this may be found
in WO 91/19806.

The CTP-glgC16 gene fusion was placed behind the soybean β-conglycinin 7S storage promoter described
above. This cassette was cloned into pMON17227, a Ti plasmid vector disclosed and described by Barry et
al. in WO 92/04449 (1991), to form the vector pMON17315. This vector was used to transform canola by
Agrobacterium transformation followed by glyphosate selection. Regenerated plants were analyzed and the
presence of the enzyme in most transformants was confirmed by Western blot analysis. Seeds from four
transformed lines have been obtained and analyzed for oil, starch, and protein content and moisture. The starch
content was found to have increased to 8.2-18.2 percent (based on fresh weight) as compared to 0.9-1.6
percent in control lines (transformed with pMON17227 only). The oil content was found to have been decreased
from 26.7-31.6 percent in the controls to 13.0-15.5 percent in the transformed lines. Protein content and
moisture were not significantly changed. In some lines seed weight was increased, which may indicate that total
yield may also be increased.
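
The ranges reported in Example 1 can be turned into an envelope of relative changes with a few lines of Python. This is only a sketch: the function and variable names are hypothetical, and because the specification reports ranges across lines rather than paired measurements, the bounds below are the extremes consistent with those ranges, not measured per-line changes.

```python
# Ranges reported in Example 1 (percent, starch based on fresh weight).
control = {"starch": (0.9, 1.6), "oil": (26.7, 31.6)}
transformed = {"starch": (8.2, 18.2), "oil": (13.0, 15.5)}


def relative_change_bounds(before, after):
    """Smallest and largest value of (after / before - 1) over the two ranges."""
    ratios = [a / b - 1.0 for a in after for b in before]
    return min(ratios), max(ratios)


def summarize():
    """Return the relative-change envelope for each reported trait."""
    return {
        trait: relative_change_bounds(control[trait], transformed[trait])
        for trait in ("starch", "oil")
    }


if __name__ == "__main__":
    for trait, (low, high) in summarize().items():
        print(f"{trait}: relative change between {low:+.0%} and {high:+.0%}")
```

Under this reading, the oil ranges correspond to a reduction of roughly 42% to 59% relative to the controls.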
SEQUENCE LISTING



(1) GENERAL INFORMATION: 



(i) APPLICANT: 

(A) NAME: Monsanto Company 

(B) STREET: 800 North Lindbergh Boulevard 

(C) CITY: St. Louis 

(D) STATE: Missouri 

(E) COUNTRY: United States of America 

(F) POSTAL CODE (ZIP) : 63167 

(G) TELEPHONE: (314)694-3131 

(H) TELEFAX: (314)694-5435 


(ii) TITLE OF INVENTION: Modified Oil Content in Seeds 
(iii) NUMBER OF SEQUENCES: 18 



(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS /MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)


(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/090523 

(B) FILING DATE: 12-JUL-1993 



(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/709663 

(B) FILING DATE: 07-JUN-1991 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/539763 

(B) FILING DATE: 18-JUN-1990 



(2) INFORMATION FOR SEQ ID NO: 1:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1296 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 


(ii) MOLECULE TYPE: DNA (genomic) 



(x) PUBLICATION INFORMATION: 

(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991 

(J) PUBLICATION DATE: 14-APR-1993 

(K) RELEVANT RESIDUES IN SEQ ID NO: 1: FROM 1 TO 1296 



(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1293 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG GTT AGT TTA GAG AAG AAC GAT CAC TTA ATG TTG GCG CGC CAG CTG 48
Met Val Ser Leu Glu Lys Asn Asp His Leu Met Leu Ala Arg Gln Leu
1 5 10 15

CCA TTG AAA TCT GTT GCC CTG ATA CTG GCG GGA GGA CGT GGT ACC CGC 96
Pro Leu Lys Ser Val Ala Leu Ile Leu Ala Gly Gly Arg Gly Thr Arg
20 25 30

CTG AAG GAT TTA ACC AAT AAG CGA GCA AAA CCG GCC GTA CAC TTC GGC 144
Leu Lys Asp Leu Thr Asn Lys Arg Ala Lys Pro Ala Val His Phe Gly
35 40 45

GGT AAG TTC CGC ATT ATC GAC TTT GCG CTG TCT AAC TGC ATC AAC TCC 192
Gly Lys Phe Arg Ile Ile Asp Phe Ala Leu Ser Asn Cys Ile Asn Ser
50 55 60

GGG ATC CGT CGT ATG GGC GTG ATC ACC CAG TAC CAG TCC CAC ACT CTG 240
Gly Ile Arg Arg Met Gly Val Ile Thr Gln Tyr Gln Ser His Thr Leu
65 70 75 80

GTG CAG CAC ATT CAG CGC GGC TGG TCA TTC TTC AAT GAA GAA ATG AAC 288
Val Gln His Ile Gln Arg Gly Trp Ser Phe Phe Asn Glu Glu Met Asn
85 90 95

GAG TTT GTC GAT CTG CTG CCA GCA CAG CAG AGA ATG AAA GGG GAA AAC 336
Glu Phe Val Asp Leu Leu Pro Ala Gln Gln Arg Met Lys Gly Glu Asn
100 105 110

TGG TAT CGC GGC ACC GCA GAT GCG GTC ACC CAA AAC CTC GAC ATT ATC 384
Trp Tyr Arg Gly Thr Ala Asp Ala Val Thr Gln Asn Leu Asp Ile Ile
115 120 125

CGT CGT TAT AAA GCG GAA TAC GTG GTG ATC CTG GCG GGC GAC CAT ATC 432
Arg Arg Tyr Lys Ala Glu Tyr Val Val Ile Leu Ala Gly Asp His Ile
130 135 140

TAC AAG CAA GAC TAC TCG CGT ATG CTT ATC GAT CAC GTC GAA AAA GGT 480
Tyr Lys Gln Asp Tyr Ser Arg Met Leu Ile Asp His Val Glu Lys Gly
145 150 155 160



GTA CGT TGT ACC GTT GTT TGT ATG CCA GTA CCG ATT GAA GAA GCC TCC 528
Val Arg Cys Thr Val Val Cys Met Pro Val Pro Ile Glu Glu Ala Ser
165 170 175

GCA TTT GGC GTT ATG GCG GTT GAT GAG AAC GAT AAA ACT ATC GAA TTC 576
Ala Phe Gly Val Met Ala Val Asp Glu Asn Asp Lys Thr Ile Glu Phe
180 185 190

GTG GAA AAA CCT GCT AAC CCG CCG TCA ATG CCG AAC GAT CCG AGC AAA 624
Val Glu Lys Pro Ala Asn Pro Pro Ser Met Pro Asn Asp Pro Ser Lys
195 200 205

TCT CTG GCG AGT ATG GGT ATC TAC GTC TTT GAC GCC GAC TAT CTG TAT 672
Ser Leu Ala Ser Met Gly Ile Tyr Val Phe Asp Ala Asp Tyr Leu Tyr
210 215 220

GAA CTG CTG GAA GAA GAC GAT CGC GAT GAG AAC TCC AGC CAC GAC TTT 720
Glu Leu Leu Glu Glu Asp Asp Arg Asp Glu Asn Ser Ser His Asp Phe
225 230 235 240

GGC AAA GAT TTG ATT CCC AAG ATC ACC GAA GCC GGT CTG GCC TAT GCG 768
Gly Lys Asp Leu Ile Pro Lys Ile Thr Glu Ala Gly Leu Ala Tyr Ala
245 250 255



CAC CCG TTC CCG CTC TCT TGC GTA CAA TCC GAC CCG GAT GCC GAG CCG 816
His Pro Phe Pro Leu Ser Cys Val Gln Ser Asp Pro Asp Ala Glu Pro
260 265 270

TAC TGG CGC GAT GTG GGT ACG CTG GAA GCT TAC TGG AAA GCG AAC CTC 864
Tyr Trp Arg Asp Val Gly Thr Leu Glu Ala Tyr Trp Lys Ala Asn Leu
275 280 285

GAT CTG GCC TCT GTG GTG CCG GAG CTG GAT ATG TAC GAT CGC AAT TGG 912
Asp Leu Ala Ser Val Val Pro Glu Leu Asp Met Tyr Asp Arg Asn Trp
290 295 300

CCA ATT CGC ACC TAC AAT GAA TCA TTA CCG CCA GCG AAA TTC GTG CAG 960
Pro Ile Arg Thr Tyr Asn Glu Ser Leu Pro Pro Ala Lys Phe Val Gln
305 310 315 320

GAT CGC TCC GGT AGC CAC GGG ATG ACC CTT AAC TCA CTG GTT TCC GGC 1008
Asp Arg Ser Gly Ser His Gly Met Thr Leu Asn Ser Leu Val Ser Gly
325 330 335

GGT TGT GTG ATC TCC GGT TCG GTG GTG GTG CAG TCC GTT CTG TTC TCG 1056
Gly Cys Val Ile Ser Gly Ser Val Val Val Gln Ser Val Leu Phe Ser
340 345 350

CGC GTT CGC GTG AAT TCA TTC TGC AAC ATT GAT TCC GCC GTA TTG TTA 1104
Arg Val Arg Val Asn Ser Phe Cys Asn Ile Asp Ser Ala Val Leu Leu
355 360 365

CCG GAA GTA TGG GTA GGT CGC TCG TGC CGT CTG CGC CGC TGC GTC ATC 1152
Pro Glu Val Trp Val Gly Arg Ser Cys Arg Leu Arg Arg Cys Val Ile
370 375 380

GAT CGT GCT TGT GTT ATT CCG GAA GGC ATG GTG ATT GGT GAA AAC GCA 1200
Asp Arg Ala Cys Val Ile Pro Glu Gly Met Val Ile Gly Glu Asn Ala
385 390 395 400

GAG GAA GAT GCA CGT CGT TTC TAT CGT TCA GAA GAA GGC ATC GTG CTG 1248
Glu Glu Asp Ala Arg Arg Phe Tyr Arg Ser Glu Glu Gly Ile Val Leu
405 410 415

GTA ACG CGC GAA ATG CTA CGG AAG TTA GGG CAT AAA CAG GAG CGA TAA 1296
Val Thr Arg Glu Met Leu Arg Lys Leu Gly His Lys Gln Glu Arg
420 425 430



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 431 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(x) PUBLICATION INFORMATION: 

(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991

(J) PUBLICATION DATE: 14-APR-1993 

(K) RELEVANT RESIDUES IN SEQ ID NO: 2: FROM 1 TO 431 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Val Ser Leu Glu Lys Asn Asp His Leu Met Leu Ala Arg Gln Leu
1 5 10 15

Pro Leu Lys Ser Val Ala Leu Ile Leu Ala Gly Gly Arg Gly Thr Arg
20 25 30

Leu Lys Asp Leu Thr Asn Lys Arg Ala Lys Pro Ala Val His Phe Gly
35 40 45

Gly Lys Phe Arg Ile Ile Asp Phe Ala Leu Ser Asn Cys Ile Asn Ser
50 55 60

Gly Ile Arg Arg Met Gly Val Ile Thr Gln Tyr Gln Ser His Thr Leu
65 70 75 80

Val Gln His Ile Gln Arg Gly Trp Ser Phe Phe Asn Glu Glu Met Asn
85 90 95

Glu Phe Val Asp Leu Leu Pro Ala Gln Gln Arg Met Lys Gly Glu Asn
100 105 110

Trp Tyr Arg Gly Thr Ala Asp Ala Val Thr Gln Asn Leu Asp Ile Ile
115 120 125

Arg Arg Tyr Lys Ala Glu Tyr Val Val Ile Leu Ala Gly Asp His Ile
130 135 140

Tyr Lys Gln Asp Tyr Ser Arg Met Leu Ile Asp His Val Glu Lys Gly
145 150 155 160

Val Arg Cys Thr Val Val Cys Met Pro Val Pro Ile Glu Glu Ala Ser
165 170 175

Ala Phe Gly Val Met Ala Val Asp Glu Asn Asp Lys Thr Ile Glu Phe
180 185 190



Val Glu Lys Pro Ala Asn Pro Pro Ser Met Pro Asn Asp Pro Ser Lys
195 200 205

Ser Leu Ala Ser Met Gly Ile Tyr Val Phe Asp Ala Asp Tyr Leu Tyr
210 215 220

Glu Leu Leu Glu Glu Asp Asp Arg Asp Glu Asn Ser Ser His Asp Phe
225 230 235 240

Gly Lys Asp Leu Ile Pro Lys Ile Thr Glu Ala Gly Leu Ala Tyr Ala
245 250 255

His Pro Phe Pro Leu Ser Cys Val Gln Ser Asp Pro Asp Ala Glu Pro
260 265 270

Tyr Trp Arg Asp Val Gly Thr Leu Glu Ala Tyr Trp Lys Ala Asn Leu
275 280 285

Asp Leu Ala Ser Val Val Pro Glu Leu Asp Met Tyr Asp Arg Asn Trp
290 295 300

Pro Ile Arg Thr Tyr Asn Glu Ser Leu Pro Pro Ala Lys Phe Val Gln
305 310 315 320

Asp Arg Ser Gly Ser His Gly Met Thr Leu Asn Ser Leu Val Ser Gly
325 330 335

Gly Cys Val Ile Ser Gly Ser Val Val Val Gln Ser Val Leu Phe Ser
340 345 350



Arg Val Arg Val Asn Ser Phe Cys Asn Ile Asp Ser Ala Val Leu Leu
355 360 365

Pro Glu Val Trp Val Gly Arg Ser Cys Arg Leu Arg Arg Cys Val Ile
370 375 380

Asp Arg Ala Cys Val Ile Pro Glu Gly Met Val Ile Gly Glu Asn Ala
385 390 395 400

Glu Glu Asp Ala Arg Arg Phe Tyr Arg Ser Glu Glu Gly Ile Val Leu
405 410 415

Val Thr Arg Glu Met Leu Arg Lys Leu Gly His Lys Gln Glu Arg
420 425 430



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1296 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1293

(x) PUBLICATION INFORMATION:

(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991

(J) PUBLICATION DATE: 14-APR-1993

(K) RELEVANT RESIDUES IN SEQ ID NO: 3: FROM 1 TO 1296



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

ATG GTT AGT TTA GAG AAG AAC GAT CAC TTA ATG TTG GCG CGC CAG CTG 48
Met Val Ser Leu Glu Lys Asn Asp His Leu Met Leu Ala Arg Gln Leu
1 5 10 15

CCA TTG AAA TCT GTT GCC CTG ATA CTG GCG GGA GGA CGT GGT ACC CGC 96
Pro Leu Lys Ser Val Ala Leu Ile Leu Ala Gly Gly Arg Gly Thr Arg
20 25 30

CTG AAG GAT TTA ACC AAT AAG CGA GCA AAA CCG GCC GTA CAC TTC GGC 144
Leu Lys Asp Leu Thr Asn Lys Arg Ala Lys Pro Ala Val His Phe Gly
35 40 45

GGT AAG TTC CGC ATT ATC GAC TTT GCG CTG TCT AAC TGC ATC AAC TCC 192
Gly Lys Phe Arg Ile Ile Asp Phe Ala Leu Ser Asn Cys Ile Asn Ser
50 55 60

GGG ATC CGT CGT ATG GGC GTG ATC ACC CAG TAC CAG TCC CAC ACT CTG 240
Gly Ile Arg Arg Met Gly Val Ile Thr Gln Tyr Gln Ser His Thr Leu
65 70 75 80

GTG CAG CAC ATT CAG CGC GGC TGG TCA TTC TTC AAT GAA GAA ATG AAC 288
Val Gln His Ile Gln Arg Gly Trp Ser Phe Phe Asn Glu Glu Met Asn
85 90 95

GAG TTT GTC GAT CTG CTG CCA GCA CAG CAG AGA ATG AAA GGG GAA AAC 336
Glu Phe Val Asp Leu Leu Pro Ala Gln Gln Arg Met Lys Gly Glu Asn
100 105 110

TGG TAT CGC GGC ACC GCA GAT GCG GTC ACC CAA AAC CTC GAC ATT ATC 384
Trp Tyr Arg Gly Thr Ala Asp Ala Val Thr Gln Asn Leu Asp Ile Ile
115 120 125

CGT CGT TAT AAA GCG GAA TAC GTG GTG ATC CTG GCG GGC GAC CAT ATC 432
Arg Arg Tyr Lys Ala Glu Tyr Val Val Ile Leu Ala Gly Asp His Ile
130 135 140

TAC AAG CAA GAC TAC TCG CGT ATG CTT ATC GAT CAC GTC GAA AAA GGT 480
Tyr Lys Gln Asp Tyr Ser Arg Met Leu Ile Asp His Val Glu Lys Gly
145 150 155 160

GTA CGT TGT ACC GTT GTT TGT ATG CCA GTA CCG ATT GAA GAA GCC TCC 528
Val Arg Cys Thr Val Val Cys Met Pro Val Pro Ile Glu Glu Ala Ser
165 170 175

GCA TTT GGC GTT ATG GCG GTT GAT GAG AAC GAT AAA ACT ATC GAA TTC 576
Ala Phe Gly Val Met Ala Val Asp Glu Asn Asp Lys Thr Ile Glu Phe
180 185 190

GTG GAA AAA CCT GCT AAC CCG CCG TCA ATG CCG AAC GAT CCG AGC AAA 624
Val Glu Lys Pro Ala Asn Pro Pro Ser Met Pro Asn Asp Pro Ser Lys
195 200 205

TCT CTG GCG AGT ATG GGT ATC TAC GTC TTT GAC GCC GAC TAT CTG TAT 672
Ser Leu Ala Ser Met Gly Ile Tyr Val Phe Asp Ala Asp Tyr Leu Tyr
210 215 220

GAA CTG CTG GAA GAA GAC GAT CGC GAT GAG AAC TCC AGC CAC GAC TTT 720
Glu Leu Leu Glu Glu Asp Asp Arg Asp Glu Asn Ser Ser His Asp Phe
225 230 235 240

GGC AAA GAT TTG ATT CCC AAG ATC ACC GAA GCC GGT CTG GCC TAT GCG 768
Gly Lys Asp Leu Ile Pro Lys Ile Thr Glu Ala Gly Leu Ala Tyr Ala
245 250 255

CAC CCG TTC CCG CTC TCT TGC GTA CAA TCC GAC CCG GAT GCC GAG CCG 816
His Pro Phe Pro Leu Ser Cys Val Gln Ser Asp Pro Asp Ala Glu Pro
260 265 270

TAC TGG CGC GAT GTG GGT ACG CTG GAA GCT TAC TGG AAA GCG AAC CTC 864
Tyr Trp Arg Asp Val Gly Thr Leu Glu Ala Tyr Trp Lys Ala Asn Leu
275 280 285



GAT CTG GCC TCT GTG GTG CCG GAG CTG GAT ATG TAC GAT CGC AAT TGG 912
Asp Leu Ala Ser Val Val Pro Glu Leu Asp Met Tyr Asp Arg Asn Trp
290 295 300

CCA ATT CGC ACC TAC AAT GAA TCA TTA CCG CCA GCG AAA TTC GTG CAG 960
Pro Ile Arg Thr Tyr Asn Glu Ser Leu Pro Pro Ala Lys Phe Val Gln
305 310 315 320

GAT CGC TCC GGT AGC CAC GGG ATG ACC CTT AAC TCA CTG GTT TCC GAC 1008
Asp Arg Ser Gly Ser His Gly Met Thr Leu Asn Ser Leu Val Ser Asp
325 330 335

GGT TGT GTG ATC TCC GGT TCG GTG GTG GTG CAG TCC GTT CTG TTC TCG 1056
Gly Cys Val Ile Ser Gly Ser Val Val Val Gln Ser Val Leu Phe Ser
340 345 350

CGC GTT CGC GTG AAT TCA TTC TGC AAC ATT GAT TCC GCC GTA TTG TTA 1104
Arg Val Arg Val Asn Ser Phe Cys Asn Ile Asp Ser Ala Val Leu Leu
355 360 365

CCG GAA GTA TGG GTA GGT CGC TCG TGC CGT CTG CGC CGC TGC GTC ATC 1152
Pro Glu Val Trp Val Gly Arg Ser Cys Arg Leu Arg Arg Cys Val Ile
370 375 380



GAT CGT GCT TGT GTT ATT CCG GAA GGC ATG GTG ATT GGT GAA AAC GCA 1200
Asp Arg Ala Cys Val Ile Pro Glu Gly Met Val Ile Gly Glu Asn Ala
385 390 395 400

GAG GAA GAT GCA CGT CGT TTC TAT CGT TCA GAA GAA GGC ATC GTG CTG 1248
Glu Glu Asp Ala Arg Arg Phe Tyr Arg Ser Glu Glu Gly Ile Val Leu
405 410 415

GTA ACG CGC GAA ATG CTA CGG AAG TTA GGG CAT AAA CAG GAG CGA TAA 1296
Val Thr Arg Glu Met Leu Arg Lys Leu Gly His Lys Gln Glu Arg
420 425 430



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 431 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(x) PUBLICATION INFORMATION:
(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991
(J) PUBLICATION DATE: 14-APR-1993

(K) RELEVANT RESIDUES IN SEQ ID NO: 4: FROM 1 TO 431

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Val Ser Leu Glu Lys Asn Asp His Leu Met Leu Ala Arg Gln Leu
1 5 10 15

Pro Leu Lys Ser Val Ala Leu Ile Leu Ala Gly Gly Arg Gly Thr Arg
20 25 30

Leu Lys Asp Leu Thr Asn Lys Arg Ala Lys Pro Ala Val His Phe Gly
35 40 45

Gly Lys Phe Arg Ile Ile Asp Phe Ala Leu Ser Asn Cys Ile Asn Ser
50 55 60

Gly Ile Arg Arg Met Gly Val Ile Thr Gln Tyr Gln Ser His Thr Leu
65 70 75 80

Val Gln His Ile Gln Arg Gly Trp Ser Phe Phe Asn Glu Glu Met Asn
85 90 95

Glu Phe Val Asp Leu Leu Pro Ala Gln Gln Arg Met Lys Gly Glu Asn
100 105 110

Trp Tyr Arg Gly Thr Ala Asp Ala Val Thr Gln Asn Leu Asp Ile Ile
115 120 125



Arg Arg Tyr Lys Ala Glu Tyr Val Val Ile Leu Ala Gly Asp His Ile
130 135 140

Tyr Lys Gln Asp Tyr Ser Arg Met Leu Ile Asp His Val Glu Lys Gly
145 150 155 160

Val Arg Cys Thr Val Val Cys Met Pro Val Pro Ile Glu Glu Ala Ser
165 170 175

Ala Phe Gly Val Met Ala Val Asp Glu Asn Asp Lys Thr Ile Glu Phe
180 185 190

Val Glu Lys Pro Ala Asn Pro Pro Ser Met Pro Asn Asp Pro Ser Lys
195 200 205

Ser Leu Ala Ser Met Gly Ile Tyr Val Phe Asp Ala Asp Tyr Leu Tyr
210 215 220

Glu Leu Leu Glu Glu Asp Asp Arg Asp Glu Asn Ser Ser His Asp Phe
225 230 235 240

Gly Lys Asp Leu Ile Pro Lys Ile Thr Glu Ala Gly Leu Ala Tyr Ala
245 250 255

His Pro Phe Pro Leu Ser Cys Val Gln Ser Asp Pro Asp Ala Glu Pro
260 265 270

Tyr Trp Arg Asp Val Gly Thr Leu Glu Ala Tyr Trp Lys Ala Asn Leu
275 280 285

Asp Leu Ala Ser Val Val Pro Glu Leu Asp Met Tyr Asp Arg Asn Trp
290 295 300

Pro Ile Arg Thr Tyr Asn Glu Ser Leu Pro Pro Ala Lys Phe Val Gln
305 310 315 320

Asp Arg Ser Gly Ser His Gly Met Thr Leu Asn Ser Leu Val Ser Asp
325 330 335

Gly Cys Val Ile Ser Gly Ser Val Val Val Gln Ser Val Leu Phe Ser
340 345 350

Arg Val Arg Val Asn Ser Phe Cys Asn Ile Asp Ser Ala Val Leu Leu
355 360 365

Pro Glu Val Trp Val Gly Arg Ser Cys Arg Leu Arg Arg Cys Val Ile
370 375 380

Asp Arg Ala Cys Val Ile Pro Glu Gly Met Val Ile Gly Glu Asn Ala
385 390 395 400

Glu Glu Asp Ala Arg Arg Phe Tyr Arg Ser Glu Glu Gly Ile Val Leu
405 410 415

Val Thr Arg Glu Met Leu Arg Lys Leu Gly His Lys Gln Glu Arg
420 425 430



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 88..354

(x) PUBLICATION INFORMATION:

(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991
(J) PUBLICATION DATE: 14-APR-1993

(K) RELEVANT RESIDUES IN SEQ ID NO: 5: FROM 1 TO 355

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

AAGCTTGTTC TCATTGTTGT TATCATTATA TATAGATGAC CAAAGCACTA GACCAAACCT 60

CAGTCACACA AAGAGTAAAG AAGAACA ATG GCT TCC TCT ATG CTC TCT TCC 111
Met Ala Ser Ser Met Leu Ser Ser
1 5

GCT ACT ATG GTT GCC TCT CCG GCT CAG GCC ACT ATG GTC GCT CCT TTC 159
Ala Thr Met Val Ala Ser Pro Ala Gln Ala Thr Met Val Ala Pro Phe
10 15 20

GGA CTT AAG TCC TCC GCT GCC TTC CCA GCC ACC CGC AAG GCT AAC 207
Asn Gly Leu Lys Ser Ser Ala Ala Phe Pro Ala Thr Arg Lys Ala Asn
25 30 35 40

AAC GAC ATT ACT TCC ATC ACA AGC AAC GGC GGA AGA GTT AAC TGC ATG 255
Asn Asp Ile Thr Ser Ile Thr Ser Asn Gly Gly Arg Val Asn Cys Met
45 50 55

CAG GTG TGG CCT CCG ATT GGA AAG AAG AAG TTT GAG ACT CTC TCT TAC 303
Gln Val Trp Pro Pro Ile Gly Lys Lys Lys Phe Glu Thr Leu Ser Tyr
60 65 70

CTT CCT GAC CTT ACC GAT TCC GGT GGT CGC GTC AAC TGC ATG CAG GCC 351
Leu Pro Asp Leu Thr Asp Ser Gly Gly Arg Val Asn Cys Met Gln Ala
75 80 85

ATG G 355
Met

(2) INFORMATION FOR SEQ ID NO: 6:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 89 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(x) PUBLICATION INFORMATION:
(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991
(J) PUBLICATION DATE: 14-APR-1993

(K) RELEVANT RESIDUES IN SEQ ID NO: 6: FROM 1 TO 89

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Ala Ser Ser Met Leu Ser Ser Ala Thr Met Val Ala Ser Pro Ala
1 5 10 15

Gln Ala Thr Met Val Ala Pro Phe Asn Gly Leu Lys Ser Ser Ala Ala
20 25 30

Phe Pro Ala Thr Arg Lys Ala Asn Asn Asp Ile Thr Ser Ile Thr Ser
35 40 45

Asn Gly Gly Arg Val Asn Cys Met Gln Val Trp Pro Pro Ile Gly Lys
50 55 60

Lys Lys Phe Glu Thr Leu Ser Tyr Leu Pro Asp Leu Thr Asp Ser Gly
65 70 75 80

Gly Arg Val Asn Cys Met Gln Ala Met
85

(2) INFORMATION FOR SEQ ID NO: 7:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1575 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 3..1565



(x) PUBLICATION INFORMATION: 

(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991 

(J) PUBLICATION DATE: 14-APR-1993 

(K) RELEVANT RESIDUES IN SEQ ID NO: 7: FROM 1 TO 1575 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CC ATG GCG GCT TCC ATT GGA GCC TTA AAA TCT TCA CCT TCT TCT AAC 47
Met Ala Ala Ser Ile Gly Ala Leu Lys Ser Ser Pro Ser Ser Asn
1 5 10 15



AAT TGC ATC AAT GAG AGA AGA AAT GAT TCT ACA CGT GCT GTA TCC AGC 95
Asn Cys Ile Asn Glu Arg Arg Asn Asp Ser Thr Arg Ala Val Ser Ser
20 25 30

AGA AAT CTC TCA TTT TCG TCT TCT CAT CTC GCC GGA GAC AAG TTG ATG 143
Arg Asn Leu Ser Phe Ser Ser Ser His Leu Ala Gly Asp Lys Leu Met
35 40 45

CCT GTA TCG TCC TTA CGT TCC CAA GGA GTC CGA TTC AAT GTG AGA AGA 191
Pro Val Ser Ser Leu Arg Ser Gln Gly Val Arg Phe Asn Val Arg Arg
50 55 60

AGT CCA ATG ATT GTG TCG CCA AAG GCT GTT TCT GAT TCG CAG AAT TCA 239
Ser Pro Met Ile Val Ser Pro Lys Ala Val Ser Asp Ser Gln Asn Ser
65 70 75

CAG ACA TGT CTA GAC CCA GAT GCT AGC CGG AGT GTT TTG GGA ATT ATT 287
Gln Thr Cys Leu Asp Pro Asp Ala Ser Arg Ser Val Leu Gly Ile Ile
80 85 90 95

CTT GGA GGT GGA GCT GGG ACC CGA CTT TAT CCT CTA ACT AAA AAA AGA 335
Leu Gly Gly Gly Ala Gly Thr Arg Leu Tyr Pro Leu Thr Lys Lys Arg
100 105 110

GCA AAG CCA GCT GTT CCA CTT GGA GCA AAT TAT CGT CTG ATT GAC ATT 383
Ala Lys Pro Ala Val Pro Leu Gly Ala Asn Tyr Arg Leu Ile Asp Ile
115 120 125

CCT GTA AGC AAC TGC TTG AAC AGT AAT ATA TCC AAG ATT TAT GTT CTC 431
Pro Val Ser Asn Cys Leu Asn Ser Asn Ile Ser Lys Ile Tyr Val Leu
130 135 140

ACA CAA TTC AAC TCT GCC TCT CTG AAT CGC CAC CTT TCA CGA GCA TAT 479
Thr Gln Phe Asn Ser Ala Ser Leu Asn Arg His Leu Ser Arg Ala Tyr
145 150 155

GCT AGC AAC ATG GGA GGA TAC AAA AAC GAG GGC TTT GTG GAA GTT CTT 527
Ala Ser Asn Met Gly Gly Tyr Lys Asn Glu Gly Phe Val Glu Val Leu
160 165 170 175

GCT GCT CAA CAA AGT CCA GAG AAC CCC GAT TGG TTC CAG GGC ACG GCT 575
Ala Ala Gln Gln Ser Pro Glu Asn Pro Asp Trp Phe Gln Gly Thr Ala
180 185 190



GAT GCT GTC AGA CAA TAT CTG TGG TTG TTT GAG GAG CAT ACT GTT CTT 623
Asp Ala Val Arg Gln Tyr Leu Trp Leu Phe Glu Glu His Thr Val Leu
195 200 205

GAA TAC CTT ATA CTT GCT GGA GAT CAT CTG TAT CGA ATG GAT TAT GAA 671
Glu Tyr Leu Ile Leu Ala Gly Asp His Leu Tyr Arg Met Asp Tyr Glu
210 215 220

AAG TTT ATT CAA GCC CAC AGA GAA ACA GAT GCT GAT ATT ACC GTT GCC 719
Lys Phe Ile Gln Ala His Arg Glu Thr Asp Ala Asp Ile Thr Val Ala
225 230 235

GCA CTG CCA ATG GAC GAG AAG CGT GCC ACT GCA TTC GGT CTC ATG AAG 767
Ala Leu Pro Met Asp Glu Lys Arg Ala Thr Ala Phe Gly Leu Met Lys
240 245 250 255

ATT GAC GAA GAA GGA CGC ATT ATT GAA TTT GCA GAG AAA CCG CAA GGA 815
Ile Asp Glu Glu Gly Arg Ile Ile Glu Phe Ala Glu Lys Pro Gln Gly
260 265 270

GAG CAA TTG CAA GCA ATG AAA GTG GAT ACT ACC ATT TTA GGT CTT GAT 863
Glu Gln Leu Gln Ala Met Lys Val Asp Thr Thr Ile Leu Gly Leu Asp
275 280 285

GAC AAG AGA GCT AAA GAA ATG CCT TTC ATT GCC AGT ATG GGT ATA TAT 911
Asp Lys Arg Ala Lys Glu Met Pro Phe Ile Ala Ser Met Gly Ile Tyr
290 295 300

GTC ATT AGC AAA GAC GTG ATG TTA AAC CTA CTT CGT GAC AAG TTC CCT 959
Val Ile Ser Lys Asp Val Met Leu Asn Leu Leu Arg Asp Lys Phe Pro
305 310 315

GGG GCC AAT GAT TTT GGT AGT GAA GTT ATT CCT GGT GCA ACT TCA CTT 1007
Gly Ala Asn Asp Phe Gly Ser Glu Val Ile Pro Gly Ala Thr Ser Leu
320 325 330 335

GGG ATG AGA GTG CAA GCT TAT TTA TAT GAT GGG TAC TGG GAA GAT ATT 1055
Gly Met Arg Val Gln Ala Tyr Leu Tyr Asp Gly Tyr Trp Glu Asp Ile
340 345 350

GGT ACC ATT GAA GCT TTC TAC AAT GCC AAT TTG GGC ATT ACA AAA AAG 1103
Gly Thr Ile Glu Ala Phe Tyr Asn Ala Asn Leu Gly Ile Thr Lys Lys
355 360 365

CCG GTG CCA GAT TTT AGC TTT TAC GAC CGA TCA GCC CCA ATC TAC ACC 1151
Pro Val Pro Asp Phe Ser Phe Tyr Asp Arg Ser Ala Pro Ile Tyr Thr
370 375 380

CAA CCT CGA TAT CTA CCA CCA TCA AAA ATG CTT GAT GCT GAT GTC ACA 1199
Gln Pro Arg Tyr Leu Pro Pro Ser Lys Met Leu Asp Ala Asp Val Thr
385 390 395

GAT AGT GTC ATT GGT GAA GGT TGT GTG ATC AAG AAC TGT AAG ATT CAT 1247 
Asp Ser Val He Gly Glu Gly Cys Val He Lys Asn Cys Lys He His 
400 405 410 415 

CAT TCC GTG GTT GGA CTC AGA TCA TGC ATA TCA GAG GGA GCA ATT ATA 1295 
His Ser Val Val Gly Leu Arg Ser Cys He Ser Glu Gly Ala He He 
420 425 430 

GAA GAC TCA CTT TTG ATG GGG GCA GAT TAC TAT GAG ACT GAT GCT GAC 1343 
Glu Asp Ser Leu Leu Met Gly Ala Asp Tyr Tyr Glu Thr Asp Ala Asp 
435 440 445 

AGG AAG TTG CTG GCT GCA AAG GGC AGT GTC CCA ATT GGC ATC GGC AAG 1391 
Arg Lys Leu Leu Ala Ala Lys Gly Ser Val Pro He Gly He Gly Lys 
450 455 460 

AAT TGT CAC ATT AAA AGA GCC ATT ATC GAC AAG AAT GCC CGT ATA GGG 1439 
Asn Cys His He Lys Arg Ala He He Asp Lys Aen Ala Arg He Gly 
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465 470 475 

GAC AAT GTG AAG ATC ATT AAC AAA GAC AAC GTT CAA GAA GOG GOT AGG 1487 
5 Asp Asn Val Lys lie He Asn Lys Asp Asn Val Gin Glu Ala Ala Arg 
480 485 490 495 

GAA ACA GAT GGA TAC TTC ATC AAG ACT GGG ATT GTC ACC GTC ATC AAG 1535 
Glu Thr Asp Gly Tyr Phe Ile Lys Ser Gly Ile Val Thr Val Ile Lys
500 505 510

GAT GCT TTG ATT CCA AGT GGA ATC ATC ATC TGATGAGCTC 1575 
Asp Ala Leu Ile Pro Ser Gly Ile Ile Ile
515 520 






(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 521 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(x) PUBLICATION INFORMATION: 

(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991 

(J) PUBLICATION DATE: 14-APR-1993 
(K) RELEVANT RESIDUES IN SEQ ID NO: 8: FROM 1 TO 521

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Ala Ala Ser Ile Gly Ala Leu Lys Ser Ser Pro Ser Ser Asn Asn
1 5 10 15

Cys Ile Asn Glu Arg Arg Asn Asp Ser Thr Arg Ala Val Ser Ser Arg
20 25 30

Asn Leu Ser Phe Ser Ser Ser His Leu Ala Gly Asp Lys Leu Met Pro
35 40 45

Val Ser Ser Leu Arg Ser Gln Gly Val Arg Phe Asn Val Arg Arg Ser
50 55 60

Pro Met Ile Val Ser Pro Lys Ala Val Ser Asp Ser Gln Asn Ser Gln
65 70 75 80

Thr Cys Leu Asp Pro Asp Ala Ser Arg Ser Val Leu Gly Ile Ile Leu
85 90 95

Gly Gly Gly Ala Gly Thr Arg Leu Tyr Pro Leu Thr Lys Lys Arg Ala
100 105 110

Lys Pro Ala Val Pro Leu Gly Ala Asn Tyr Arg Leu Ile Asp Ile Pro
115 120 125









Val Ser Asn Cys Leu Asn Ser Asn Ile Ser Lys Ile Tyr Val Leu Thr
130 135 140

Gln Phe Asn Ser Ala Ser Leu Asn Arg His Leu Ser Arg Ala Tyr Ala
145 150 155 160

Ser Asn Met Gly Gly Tyr Lys Asn Glu Gly Phe Val Glu Val Leu Ala
165 170 175

Ala Gln Gln Ser Pro Glu Asn Pro Asp Trp Phe Gln Gly Thr Ala Asp
180 185 190

Ala Val Arg Gln Tyr Leu Trp Leu Phe Glu Glu His Thr Val Leu Glu
195 200 205

Tyr Leu Ile Leu Ala Gly Asp His Leu Tyr Arg Met Asp Tyr Glu Lys
210 215 220

Phe Ile Gln Ala His Arg Glu Thr Asp Ala Asp Ile Thr Val Ala Ala
225 230 235 240

Leu Pro Met Asp Glu Lys Arg Ala Thr Ala Phe Gly Leu Met Lys Ile
245 250 255

Asp Glu Glu Gly Arg Ile Ile Glu Phe Ala Glu Lys Pro Gln Gly Glu
260 265 270

Gln Leu Gln Ala Met Lys Val Asp Thr Thr Ile Leu Gly Leu Asp Asp
275 280 285

Lys Arg Ala Lys Glu Met Pro Phe Ile Ala Ser Met Gly Ile Tyr Val
290 295 300

Ile Ser Lys Asp Val Met Leu Asn Leu Leu Arg Asp Lys Phe Pro Gly
305 310 315 320

Ala Asn Asp Phe Gly Ser Glu Val Ile Pro Gly Ala Thr Ser Leu Gly
325 330 335

Met Arg Val Gln Ala Tyr Leu Tyr Asp Gly Tyr Trp Glu Asp Ile Gly
340 345 350

Thr Ile Glu Ala Phe Tyr Asn Ala Asn Leu Gly Ile Thr Lys Lys Pro
355 360 365

Val Pro Asp Phe Ser Phe Tyr Asp Arg Ser Ala Pro Ile Tyr Thr Gln
370 375 380

Pro Arg Tyr Leu Pro Pro Ser Lys Met Leu Asp Ala Asp Val Thr Asp
385 390 395 400






Ser Val Ile Gly Glu Gly Cys Val Ile Lys Asn Cys Lys Ile His His
405 410 415

Ser Val Val Gly Leu Arg Ser Cys Ile Ser Glu Gly Ala Ile Ile Glu
420 425 430



Asp Ser Leu Leu Met Gly Ala Asp Tyr Tyr Glu Thr Asp Ala Asp Arg
435 440 445

Lys Leu Leu Ala Ala Lys Gly Ser Val Pro Ile Gly Ile Gly Lys Asn
450 455 460

Cys His Ile Lys Arg Ala Ile Ile Asp Lys Asn Ala Arg Ile Gly Asp
465 470 475 480

Asn Val Lys Ile Ile Asn Lys Asp Asn Val Gln Glu Ala Ala Arg Glu
485 490 495

Thr Asp Gly Tyr Phe Ile Lys Ser Gly Ile Val Thr Val Ile Lys Asp
500 505 510

Ala Leu Ile Pro Ser Gly Ile Ile Ile
515 520

(2) INFORMATION FOR SEQ ID NO: 9: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1519 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1410



(x) PUBLICATION INFORMATION: 

(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991 

(J) PUBLICATION DATE: 14-APR-1993 

(K) RELEVANT RESIDUES IN SEQ ID NO: 9: FROM 1 TO 1519

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:



AAC AAG ATC AAA CCT GGG GTT GCT TAC TCT GTG ATC ACT ACT GAA AAT 48 
Asn Lys Ile Lys Pro Gly Val Ala Tyr Ser Val Ile Thr Thr Glu Asn
1 5 10 15

GAC ACA CAG ACT GTG TTC GTA GAT ATG CCA CGT CTT GAG AGA CGC CGG 96 
Asp Thr Gln Thr Val Phe Val Asp Met Pro Arg Leu Glu Arg Arg Arg
20 25 30 



GCA AAT CCA AAG GAT GTG GCT GCA GTC ATA CTG GGA GGA GGA GAA GGG 144 
Ala Asn Pro Lys Asp Val Ala Ala Val Ile Leu Gly Gly Gly Glu Gly
35 40 45 

ACC AAG TTA TTC CCA CTT ACA AGT AGA ACT GCA ACC CCT GCT GTT CCG 192 
Thr Lys Leu Phe Pro Leu Thr Ser Arg Thr Ala Thr Pro Ala Val Pro
50 55 60

GTT GGA GGA TGC TAC AGG CTA ATA GAC ATC CCA ATG AGC AAC TGT ATC 240 
Val Gly Gly Cys Tyr Arg Leu Ile Asp Ile Pro Met Ser Asn Cys Ile
65 70 75 80 

AAC AGT GCT ATT AAC AAG ATT TTT GTG CTG ACA CAG TAC AAT TCT GCT 288 
Asn Ser Ala Ile Asn Lys Ile Phe Val Leu Thr Gln Tyr Asn Ser Ala
85 90 95 

CCC CTG AAT CGT CAC ATT GCT CGA ACA TAT TTT GGC AAT GGT GTG AGC 336 
Pro Leu Asn Arg His Ile Ala Arg Thr Tyr Phe Gly Asn Gly Val Ser
100 105 110 

TTT GGA GAT GGA TTT GTC GAG GTA CTA GCT GGA ACT CAG ACA CCC GGG 384 
Phe Gly Asp Gly Phe Val Glu Val Leu Ala Ala Thr Gln Thr Pro Gly
115 120 125 

GAA GCA GGA AAA AAA TGG TTT CAA GGA ACA GCA GAT GCT GTT AGA AAA 432 
Glu Ala Gly Lys Lys Trp Phe Gln Gly Thr Ala Asp Ala Val Arg Lys
130 135 140 

TTT ATA TGG GTT TTT GAG GAC GCT AAG AAC AAG AAT ATT GAA AAT ATC 480 
Phe Ile Trp Val Phe Glu Asp Ala Lys Asn Lys Asn Ile Glu Asn Ile
145 150 155 160 

GTT GTA CTA TCT GGG GAT CAT CTT TAT AGG ATG GAT TAT ATG GAG TTG 528 
Val Val Leu Ser Gly Asp His Leu Tyr Arg Met Asp Tyr Met Glu Leu 
165 170 175 

GTG CAG AAC CAT ATT GAC AGG AAT GCT GAT ATT ACT CTT TCA TGT GCA 576
Val Gln Asn His Ile Asp Arg Asn Ala Asp Ile Thr Leu Ser Cys Ala
180 185 190 






CCA GCT GAG GAC AGC CGA GCA TCA GAT TTT GGG CTG GTC AAG ATT GAC 624 
Pro Ala Glu Asp Ser Arg Ala Ser Asp Phe Gly Leu Val Lys Ile Asp
195 200 205 

AGC AGA GGC AGA GTA GTC CAG TTT GCT GAA AAA CCA AAA GGT TTT GAT 672 
Ser Arg Gly Arg Val Val Gln Phe Ala Glu Lys Pro Lys Gly Phe Asp
210 215 220 

CTT AAA GCA ATG CAA GTA GAT ACT ACT CTT GTT GGA TTA TCT CCA CAA 720 
Leu Lys Ala Met Gln Val Asp Thr Thr Leu Val Gly Leu Ser Pro Gln
225 230 235 240 

GAT GCG AAG AAA TCC CCC TAT ATT GCT TCA ATG GGA GTT TAT GTA TTC 768
Asp Ala Lys Lys Ser Pro Tyr Ile Ala Ser Met Gly Val Tyr Val Phe
245 250 255 

AAG ACA GAT GTA TTG TTG AAG CTC TTG AAA TGG AGC TAT CCC ACT TCT 816 
Lys Thr Asp Val Leu Leu Lys Leu Leu Lys Trp Ser Tyr Pro Thr Ser
260 265 270 

AAT GAT TTT GGC TCT GAA ATT ATA CCA GCA GCT ATT GAC GAT TAC AAT 864 
Asn Asp Phe Gly Ser Glu Ile Ile Pro Ala Ala Ile Asp Asp Tyr Asn
275 280 285

GTC CAA GCA TAC ATT TTC AAA GAC TAT TGG GAA GAC ATT GGA ACA ATT 912
Val Gln Ala Tyr Ile Phe Lys Asp Tyr Trp Glu Asp Ile Gly Thr Ile
290 295 300 

AAA TCG TTT TAT AAT GCT AGC TTG GCA CTC ACA CAA GAG TTT CCA GAG 960 
Lys Ser Phe Tyr Asn Ala Ser Leu Ala Leu Thr Gln Glu Phe Pro Glu
305 310 315 320 

TTC CAA TTT TAC GAT CCA AAA ACA CCT TTT TAC ACA TCT CCT AGG TTC 1008 
Phe Gln Phe Tyr Asp Pro Lys Thr Pro Phe Tyr Thr Ser Pro Arg Phe
325 330 335 

CTT CCA CCA ACC AAG ATA GAC AAT TGC AAG ATT AAG GAT GCC ATA ATC 1056 
Leu Pro Pro Thr Lys Ile Asp Asn Cys Lys Ile Lys Asp Ala Ile Ile
340 345 350 

TCT CAT GGA TGT TTC TTG CGA GAT TGT TCT GTG GAA CAC TCC ATA GTG 1104 
Ser His Gly Cys Phe Leu Arg Asp Cys Ser Val Glu His Ser Ile Val
355 360 365 

GGT GAA AGA TCG CGC TTA GAT TGT GGT GTT GAA CTG AAG GAT ACT TTC 1152
Gly Glu Arg Ser Arg Leu Asp Cys Gly Val Glu Leu Lys Asp Thr Phe 
370 375 380 

ATG ATG GGA GCA GAC TAC TAC CAA ACA GAA TCT GAG ATT GCC TCC CTG 1200 
Met Met Gly Ala Asp Tyr Tyr Gln Thr Glu Ser Glu Ile Ala Ser Leu
385 390 395 400 

TTA GCA GAG GGG AAA GTA CCG ATT GGA ATT GGG GAA AAT ACA AAA ATA 1248 
Leu Ala Glu Gly Lys Val Pro Ile Gly Ile Gly Glu Asn Thr Lys Ile
405 410 415 

AGG AAA TGT ATC ATT GAC AAG AAC GCA AAG ATA GGA AAG AAT GTT TCA 1296 
Arg Lys Cys Ile Ile Asp Lys Asn Ala Lys Ile Gly Lys Asn Val Ser
420 425 430 

ATC ATA AAT AAA GAC GGT GTT CAA GAG GCA GAC CGA CCA GAG GAA GGA 1344 
Ile Ile Asn Lys Asp Gly Val Gln Glu Ala Asp Arg Pro Glu Glu Gly
435 440 445 

TTC TAC ATA CGA TCA GGG ATA ATC ATT ATA TTA GAG AAA GCC ACA ATT 1392 
Phe Tyr Ile Arg Ser Gly Ile Ile Ile Ile Leu Glu Lys Ala Thr Ile
450 455 460 

AGA GAT GGA ACA GTC ATC TGAACTAGGG AAGCACCTCT TGTTGAACTA 1440 
Arg Asp Gly Thr Val Ile
465 470

CTGGAGATCC AAATCTCAAC TTGAAGAAGG TCAAGGGTGA TCCTAGCACG TTCACCAGTT 1500 

GACTCCCCGA AGGAAGCTT 1519

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(x) PUBLICATION INFORMATION: 

(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991

(J) PUBLICATION DATE: 14-APR-1993

(K) RELEVANT RESIDUES IN SEQ ID NO: 10: FROM 1 TO 470 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Asn Lys Ile Lys Pro Gly Val Ala Tyr Ser Val Ile Thr Thr Glu Asn
1 5 10 15

Asp Thr Gln Thr Val Phe Val Asp Met Pro Arg Leu Glu Arg Arg Arg
20 25 30

Ala Asn Pro Lys Asp Val Ala Ala Val Ile Leu Gly Gly Gly Glu Gly
35 40 45

Thr Lys Leu Phe Pro Leu Thr Ser Arg Thr Ala Thr Pro Ala Val Pro
50 55 60

Val Gly Gly Cys Tyr Arg Leu Ile Asp Ile Pro Met Ser Asn Cys Ile
65 70 75 80

Asn Ser Ala Ile Asn Lys Ile Phe Val Leu Thr Gln Tyr Asn Ser Ala
85 90 95

Pro Leu Asn Arg His Ile Ala Arg Thr Tyr Phe Gly Asn Gly Val Ser
100 105 110

Phe Gly Asp Gly Phe Val Glu Val Leu Ala Ala Thr Gln Thr Pro Gly
115 120 125

Glu Ala Gly Lys Lys Trp Phe Gln Gly Thr Ala Asp Ala Val Arg Lys
130 135 140

Phe Ile Trp Val Phe Glu Asp Ala Lys Asn Lys Asn Ile Glu Asn Ile
145 150 155 160

Val Val Leu Ser Gly Asp His Leu Tyr Arg Met Asp Tyr Met Glu Leu
165 170 175

Val Gln Asn His Ile Asp Arg Asn Ala Asp Ile Thr Leu Ser Cys Ala
180 185 190

Pro Ala Glu Asp Ser Arg Ala Ser Asp Phe Gly Leu Val Lys Ile Asp
195 200 205

Ser Arg Gly Arg Val Val Gln Phe Ala Glu Lys Pro Lys Gly Phe Asp
210 215 220

Leu Lys Ala Met Gln Val Asp Thr Thr Leu Val Gly Leu Ser Pro Gln
225 230 235 240

Asp Ala Lys Lys Ser Pro Tyr Ile Ala Ser Met Gly Val Tyr Val Phe
245 250 255

Lys Thr Asp Val Leu Leu Lys Leu Leu Lys Trp Ser Tyr Pro Thr Ser
260 265 270

Asn Asp Phe Gly Ser Glu Ile Ile Pro Ala Ala Ile Asp Asp Tyr Asn
275 280 285

Val Gln Ala Tyr Ile Phe Lys Asp Tyr Trp Glu Asp Ile Gly Thr Ile
290 295 300

Lys Ser Phe Tyr Asn Ala Ser Leu Ala Leu Thr Gln Glu Phe Pro Glu
305 310 315 320

Phe Gln Phe Tyr Asp Pro Lys Thr Pro Phe Tyr Thr Ser Pro Arg Phe
325 330 335

Leu Pro Pro Thr Lys Ile Asp Asn Cys Lys Ile Lys Asp Ala Ile Ile
340 345 350

Ser His Gly Cys Phe Leu Arg Asp Cys Ser Val Glu His Ser Ile Val
355 360 365

Gly Glu Arg Ser Arg Leu Asp Cys Gly Val Glu Leu Lys Asp Thr Phe
370 375 380

Met Met Gly Ala Asp Tyr Tyr Gln Thr Glu Ser Glu Ile Ala Ser Leu
385 390 395 400

Leu Ala Glu Gly Lys Val Pro Ile Gly Ile Gly Glu Asn Thr Lys Ile
405 410 415

Arg Lys Cys Ile Ile Asp Lys Asn Ala Lys Ile Gly Lys Asn Val Ser
420 425 430

Ile Ile Asn Lys Asp Gly Val Gln Glu Ala Asp Arg Pro Glu Glu Gly
435 440 445

Phe Tyr Ile Arg Ser Gly Ile Ile Ile Ile Leu Glu Lys Ala Thr Ile
450 455 460

Arg Asp Gly Thr Val Ile
465 470

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (synthetic)

(x) PUBLICATION INFORMATION:

(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991 

(J) PUBLICATION DATE: 14-APR-1993 

(K) RELEVANT RESIDUES IN SEQ ID NO: 11: FROM 1 TO 35



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GTTGATAACA AGATCTGTTA ACCATGGCGG CTTCC 35 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (synthetic) 



(x) PUBLICATION INFORMATION:

(H) DOCUMENT NUMBER: EP 0536293 A1
(I) FILING DATE: 07-JUN-1991

(J) PUBLICATION DATE: 14-APR-1993 

(K) RELEVANT RESIDUES IN SEQ ID NO: 12: FROM 1 TO 33 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

CCAGTTAAAA CGGAGCTCAT CAGATGATGA TTC 33 
(2) INFORMATION FOR SEQ ID NO: 13: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(x) PUBLICATION INFORMATION:
(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991 
(J) PUBLICATION DATE: 14-APR-1993

(K) RELEVANT RESIDUES IN SEQ ID NO: 13: FROM 1 TO 30



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GTGTGAGAAC ATAAATCTTG GATATGTTAC 30 
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(x) PUBLICATION INFORMATION: 

(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991 

(J) PUBLICATION DATE: 14-APR-1993 

(K) RELEVANT RESIDUES IN SEQ ID NO: 14: FROM 1 TO 28 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GAATTCACAG GGCCATGGCT CTAGACCC 28 
(2) INFORMATION FOR SEQ ID NO: 15: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: DNA (synthetic) 



(x) PUBLICATION INFORMATION: 

(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991 

(J) PUBLICATION DATE: 14-APR-1993 

(K) RELEVANT RESIDUES IN SEQ ID NO: 15: FROM 1 TO 40 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

AAGATCAAAC CTGCCATGGC TTACTCTGTG ATCACTACTG 40

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic)

(x) PUBLICATION INFORMATION: 

(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991

(J) PUBLICATION DATE: 14-APR-1993

(K) RELEVANT RESIDUES IN SEQ ID NO: 16: FROM 1 TO 39



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GGGAATTCAA GCTTGGATCC CGGGCCCCCC CCCCCCCCC 39 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(x) PUBLICATION INFORMATION: 

(H) DOCUMENT NUMBER: EP 0536293 A1
(I) FILING DATE: 07-JUN-1991

(J) PUBLICATION DATE: 14-APR-1993 

(K) RELEVANT RESIDUES IN SEQ ID NO: 17: FROM 1 TO 24 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GGGAATTCAA GCTTGGATCC CGGG 24 
(2) INFORMATION FOR SEQ ID NO: 18: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(x) PUBLICATION INFORMATION: 
(H) DOCUMENT NUMBER: EP 0536293 A1

(I) FILING DATE: 07-JUN-1991 
(J) PUBLICATION DATE: 14-APR-1993 

(K) RELEVANT RESIDUES IN SEQ ID NO: 18: FROM 1 TO 32 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
CCTCTAGACA GTCGATCAGG AGCAGATGTA CG 32 






Claims 



1. A method of producing plant seeds having decreased oil content comprising providing increased levels
of ADPglucose pyrophosphorylase within said seeds by transforming said plant using the following steps:

(a) inserting into the genome of a plant cell a recombinant, double-stranded DNA molecule comprising

(i) a promoter which functions in plants to cause the production of an RNA sequence in plant seeds, 

(ii) a structural DNA sequence that causes the production of an RNA sequence which encodes a
fusion polypeptide comprising an amino-terminal plastid transit peptide and an ADPglucose
pyrophosphorylase enzyme,

(iii) a 3' non-translated DNA sequence which functions in plant cells to cause transcriptional
termination and the addition of polyadenylated nucleotides to the 3' end of the RNA sequence;

(b) obtaining transformed plant cells; and 

(c) regenerating from the transformed plant cells genetically transformed plants which produce seeds 
having a decreased oil content; 

wherein said ADPglucose pyrophosphorylase enzyme is deregulated.

2. The method of claim 1 wherein said enzyme is from E. coli.

3. The method of claim 2 wherein said enzyme is glgC16.

4. The method of claim 3 wherein said plant is selected from the group consisting of wheat, canola, soybean,
corn, cotton, sunflower, almond, cashew, pecan, walnut, and peanut.